On the Role of Estimate-and-Forward with 
Time-Sharing in Cooperative Communication 
Abstract 

In this work we focus on the general relay channel. We investigate the application of estimate-and-forward (EAF) to 
different scenarios. Specifically, we consider assignments of the auxiliary random variables that always satisfy the 
feasibility constraints. We first consider the multiple relay channel and obtain an achievable rate without decoding 
at the relays. We demonstrate the benefits of this result via an explicit discrete memoryless multiple relay scenario 
where multi-relay EAF is superior to multi-relay decode-and-forward (DAF). We then consider the Gaussian relay 
channel with coded modulation, where we show that a three-level quantization outperforms the Gaussian quantization 
commonly used to evaluate the achievable rates in this scenario. Finally we consider the cooperative general broadcast 
scenario with a multi-step conference. We apply estimate-and-forward to obtain a general multi-step achievable rate 
region. We then give an explicit assignment of the auxiliary random variables, and use this result to obtain an explicit 
expression for the single common message broadcast scenario with a two-step conference. 

I. Introduction 

The relay channel was introduced by van der Meulen in 1971 [1]. In this setup, a single transmitter with channel 
input $X^n$ communicates with a single receiver with channel output $Y^n$, where the superscript $n$ denotes the length 
of a vector. In addition, an external transceiver, called a relay, listens to the channel and is able to output signals 
to the channel. We denote the relay output with $Y_1^n$ and its input with $X_1^n$. This setup is depicted in figure 1. 

A. Relaying Strategies 

In [2] Cover & El-Gamal introduced two relaying strategies commonly referred to as decode-and-forward (DAF) 
and estimate-and-forward (EAF). In DAF the relay decodes the message sent from the transmitter and then, at the 
next time interval, transmits a codeword based on the decoded message. The rate achievable with DAF is given in 
[2, theorem 1]: 

Theorem 1: (achievability of [2, theorem 1]) For the general relay channel any rate R satisfying 

$$R < \min\{I(X, X_1; Y),\ I(X; Y_1|X_1)\} \tag{1}$$

for some joint distribution $p(x, x_1, y, y_1) = p(x, x_1)p(y, y_1|x, x_1)$, is achievable. 

Fig. 1. The relay channel. The encoder sends a message W to the decoder. 

We note that for DAF to be effective, the rate to the relay has to be greater than the point-to-point rate, i.e. 

$$I(X; Y_1|X_1) > I(X; Y|X_1), \tag{2}$$



otherwise higher rates could be obtained without using the relay at all. For relay channels where DAF is not useful 
or not optimal, [2] proposed the EAF strategy. In this strategy, the relay sends an estimate of its channel input to the 
destination, without decoding the source message at all. The achievable rate with EAF is given in [2, theorem 6]: 
Theorem 2: ([2, theorem 6]) For the general relay channel any rate R satisfying 

$$R < I(X; Y, \hat{Y}_1|X_1), \tag{3}$$
$$\text{subject to } I(X_1; Y) \ge I(Y_1; \hat{Y}_1|X_1, Y), \tag{4}$$

for some joint distribution $p(x, x_1, y, y_1, \hat{y}_1) = p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|y_1, x_1)$, where $\|\hat{\mathcal{Y}}_1\| < \infty$, is achievable. 

Of course, one can combine the DAF and EAF schemes by performing partial decoding at the relay, thus obtaining 
higher rates as in [2, theorem 7]. 



B. Related Work 

In recent years, the research in relaying has mainly focused on multiple-level relaying and the MIMO relay 
channel. In the context of multiple-level relaying based on DAF, several DAF variations were considered. In [3] 
Cover & El-Gamal's block Markov encoding/successive decoding DAF method was applied to the multiple-relay 
case. Later work [4], [5] and [6] applied the so-called regular encoding/sliding-window decoding and the regular 
encoding/backward decoding techniques to the multiple-relay scenario. In [7] the DAF strategy was applied to 
the MIMO relay channel. The EAF strategy was also applied to the multiple-relay scenario. The work in [8], for 
example, considered the EAF strategy for multiple relay scenarios and the Gaussian relay channel, in addition to 
considering the DAF strategy. Also [9] considered the EAF strategy in the multiple-relay setup. Another approach 
applied recently to the relay channel is that of iterative decoding. In [10] the three-node network in the half-duplex 
regime was considered. In the relay case, [10] uses a feedback scheme where the receiver first uses EAF to send 
information to the relay and then the relay decodes and uses DAF at the next time interval to help the receiver 
decode its message. Combinations of EAF and DAF were also considered in [11], where conferencing schemes 
over orthogonal relay-receiver channels were analyzed and compared. Both [10] and [11] focus on the Gaussian 
case. 

An extension of the relay scenario to a hybrid broadcast/relay system was introduced in [12] in which the authors 
applied a combination of EAF and DAF strategies to the independent broadcast channel with a single common 
message, and then extended this strategy to the multi-step conference. In [13] we used both a single-step and a 
two-step conference with orthogonal conferencing channels in the discrete memoryless framework. A thorough 
investigation of the broadcast-relay channel was done in [14], where the authors applied the DAF strategy to the 
case where only one user is helping the other user, and also presented an upper bound for this case. Then, the fully 
cooperative scenario was analyzed. The authors applied both the DAF and the EAF methods to that case. 

C. The Gaussian Relay Channel with Coded Modulation 

One important instance of the relay channel we consider in this work is the Gaussian relay channel with 
coded modulation. This scenario is important in evaluating the rates achievable with practical communication 
systems, where components in the receive chain, such as equalization for example, require a uniformly distributed 
finite constellation for optimal operation. In Gaussian relay channel scenarios, three types of relaying 
techniques are most often encountered: 

• The first technique is decode-and-forward. This technique achieves capacity for the physically degraded 
Gaussian relay channel (see [2, section IV]), and also for more general relay channels under certain conditions 
(see [11]). 

• The second technique is estimate-and-forward, where the auxiliary variable $\hat{Y}_1$ is assigned a Gaussian 
distribution. For example, in [15, section IV] a Gaussian auxiliary random variable (RV) is used in conjunction with 
time-sharing at the transmitter, and in [16] the ergodic capacity for full duplex transmission with Gaussian 
EAF is obtained. 

• The third technique is linear relaying, where the relay transmits a weighted sum of all its previously received 
inputs [15, section V]. An important subclass of this family of relaying functions is when the relay transmits 
a scaled version of its input. This method is called amplify-and-forward [17], and was later combined with 
DAF to produce the decode-amplify-and-forward method of [18]. 

Several recent papers consider the Gaussian relay channel with coded modulation. In [19] the author considered 
variations of DAF for different practical systems. In [17] DAF and amplify-and-forward were considered for coherent 
orthogonal BPSK signalling, and in [20] a practical construction that implements a half-duplex EAF coding scheme 
was proposed. 

As indicated by several authors (see [15]) it is not obvious if a Gaussian relay function is indeed optimal. In this 
paper we show that for the case of coded modulation, there are scenarios where non-Gaussian assignments of the 
auxiliary RV result in a higher rate than the commonly applied Gaussian assignment. 

February 1, 2008 DRAFT 

D. Main Contributions 

In the following we summarize the main contributions of this work: 

• We give an intuitive insight into the relay channel in terms of information flow on a graph, and show how 
to obtain [2, theorem 6] from flow considerations. Using flow considerations we also obtain the rate of the 
EAF strategy when the receiver uses joint-decoding. A similar expression can be obtained by specializing the 
result of [22] to the case where the relay does not perform partial decoding. We then show that joint-decoding 
does not increase the maximum rate of the EAF strategy, and find the time-sharing assignment that obtains the 
joint-decoding rate from the general EAF expression. We also present another time-sharing assignment that 
always exceeds the joint-decoding rate. 

• We introduce an achievable rate expression for the multiple relay scenario based on EAF, that is also practically 
computable. As discussed in section I-A, in the "noisy relay" case EAF outperforms DAF. However, for the 
multiple relay scenario there is no explicit, computationally practical expression based on EAF that can be 
compared with the DAF-based result presented in [5], so that the best strategy can be selected. As indicated in [8, 
remark 22, remark 23], applying general EAF to a network with an arbitrary number of relays is computationally 
impractical due to the large number of constraints that characterize the feasible region. Therefore, it is interesting 
to explore a computationally simple assignment that allows us to derive a result that extends to an arbitrary number 
of relays. We also provide an explicit numerical example to demonstrate that there are indeed cases where 
multi-relay EAF outperforms multi-relay DAF. 

• We consider the optimization of the EAF auxiliary random variable for the Gaussian relay channel with an 
orthogonal relay. We consider the coded modulation scenario, and show that there are three regions: high 
SNR on the source-relay link, where DAF is the best strategy, low SNR on the source-relay link in which the 
common EAF with Gaussian assignment is best, and an intermediate region where EAF with hard-decision per 
symbol is optimal. For this intermediate SNR region we consider two kinds of hard-decisions: deterministic 
and probabilistic, and show that each one of them can be superior, depending on the channel conditions. 

• Lastly, we consider the cooperative broadcast scenario with a multi-step conference. We present a general rate 
region, extending the Marton rate region of [21] to the case where the receivers hold a $K$-cycle conference 
prior to decoding the messages. We then specialize this result to the single common message case and obtain 
explicit expressions (without auxiliary RVs) for the two-step conference. 

The rest of this paper is organized as follows: in section II we discuss the single relay case. We consider the 
EAF strategy with time-sharing (TS) and relate it to the EAF rate expression for joint-decoding at the destination 
receiver. In section III we present an achievable region for the multiple-relay channel, and in section IV we examine 
the Gaussian relay channel with coded modulation. In section V we investigate the general cooperative broadcast 
scenario, and obtain an explicit rate expression by applying TS-EAF to the general multi-step conference. Finally, 
section VI presents concluding remarks. 
II. Time-Sharing for the Single-Relay Case 



A. Definitions 



First, a word about notation: we denote discrete random variables with capital letters e.g. $X$, $Y$, and their 
realizations with lower case letters $x$, $y$. A random variable $X$ takes values in a set $\mathcal{X}$. We use $\|\mathcal{X}\|$ to denote 
the cardinality of a finite discrete set $\mathcal{X}$, and $p_X(x)$ denotes the probability distribution function (p.d.f.) of $X$ 
on $\mathcal{X}$. For brevity we may omit the subscript $X$ when it is obvious from the context. We denote vectors with 
boldface letters, e.g. $\mathbf{x}$, $\mathbf{y}$; the i'th element of a vector $\mathbf{x}$ is denoted by $x_i$ and we use $x_i^j$ where $i < j$ to denote 
$(x_i, x_{i+1}, \ldots, x_{j-1}, x_j)$. We use $A_\epsilon^{*(n)}(X)$ to denote the set of $\epsilon$-strongly typical sequences w.r.t. distribution $p_X(x)$ 
on $\mathcal{X}$, as defined in [23, ch. 5.1] and $A_\epsilon^{(n)}(X)$ to denote the $\epsilon$-weakly typical set as defined in [24, ch. 3]. 

We also have the following definitions: 

Definition 1: The discrete relay channel is defined by two discrete input alphabets $\mathcal{X}$ and $\mathcal{X}_1$, two discrete output 
alphabets $\mathcal{Y}$ and $\mathcal{Y}_1$ and a probability density function $p(y, y_1|x, x_1)$ giving the probability distribution on $\mathcal{Y} \times \mathcal{Y}_1$ 
for each $(x, x_1) \in \mathcal{X} \times \mathcal{X}_1$. The relay channel is called memoryless if the probability of a block of n transmissions 
is given by $p(\mathbf{y}, \mathbf{y}_1|\mathbf{x}, \mathbf{x}_1) = \prod_{i=1}^{n} p(y_i, y_{1,i}|x_i, x_{1,i})$. 
In this paper we consider only the memoryless relay channel. 

Definition 2: A $(2^{nR}, n)$ code for the relay channel consists of a source message set $\mathcal{W} = \{1, 2, \ldots, 2^{nR}\}$, a 
mapping function $f$ at the encoder, 

$$f : \mathcal{W} \to \mathcal{X}^n,$$

a set of n relay functions 

$$x_{1,i} = t_i(y_{1,1}, y_{1,2}, \ldots, y_{1,i-1}),$$

where the i'th relay function $t_i$ maps the first $i - 1$ channel outputs at the relay into a transmitted relay symbol at 
time i. Lastly we have a decoder 

$$g : \mathcal{Y}^n \to \mathcal{W}.$$

Definition 3: The average probability of error for a code of length n for the relay channel is defined as 

$$P_e^{(n)} = \Pr(g(Y^n) \ne W),$$

where W is selected uniformly over $\mathcal{W}$. 

Definition 4: A rate R is called achievable if there exists a sequence of $(2^{nR}, n)$ codes with $P_e^{(n)} \to 0$ as $n \to \infty$. 



B. The Single Relay EAF with Time-Sharing 



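Consider the following assignment of the auxiliary random variable of theorem 2: 

$$p(\hat{y}_1|x_1, y_1) = \begin{cases} q, & \hat{y}_1 = y_1 \\ 1 - q, & \hat{y}_1 = \Delta \notin \mathcal{Y}_1 \end{cases} \tag{5}$$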
Under this assignment, the feasibility condition of (4) becomes 

$$\begin{aligned}
I(X_1; Y) &\ge I(Y_1; \hat{Y}_1|X_1, Y) \\
&= H(Y_1|X_1, Y) - H(Y_1|X_1, Y, \hat{Y}_1) \\
&= H(Y_1|X_1, Y) - (1 - q)H(Y_1|X_1, Y) - qH(Y_1|X_1, Y, Y_1) \\
&= qH(Y_1|X_1, Y),
\end{aligned}$$

and the rate expression (3) becomes 

$$\begin{aligned}
R &< I(X; Y, \hat{Y}_1|X_1) \\
&= I(X; Y|X_1) + I(X; \hat{Y}_1|X_1, Y) \\
&= I(X; Y|X_1) + H(X|X_1, Y) - H(X|X_1, Y, \hat{Y}_1) \\
&= I(X; Y|X_1) + H(X|X_1, Y) - (1 - q)H(X|X_1, Y) - qH(X|X_1, Y, Y_1) \\
&= I(X; Y|X_1) + qI(X; Y_1|X_1, Y).
\end{aligned}$$
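The key step in both derivations is that an erasure-type compression, which passes the relay observation through with probability q and outputs an erasure symbol otherwise, scales the relevant mutual information by exactly q. The following minimal sketch checks this numerically in the unconditioned case, i.e. it verifies that the mutual information between $Y_1$ and its erased version equals $qH(Y_1)$; conditioning on $(X_1, Y)$ is dropped only to keep the example short. The function names `entropy_bits` and `erasure_mutual_information` are hypothetical and are not taken from the paper.

```python
import numpy as np


def entropy_bits(p):
    """Entropy in bits of a pmf given as an array of any shape."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]  # 0 * log 0 is taken as 0
    return float(-(p * np.log2(p)).sum())


def erasure_mutual_information(p_y1, q):
    """I(Y1; Yhat1) when Yhat1 = Y1 w.p. q and Yhat1 = erasure w.p. 1 - q."""
    p_y1 = np.asarray(p_y1, dtype=float)
    k = p_y1.size
    # Joint pmf of (Y1, Yhat1); the last column is the erasure symbol.
    joint = np.zeros((k, k + 1))
    joint[np.arange(k), np.arange(k)] = q * p_y1
    joint[:, k] = (1 - q) * p_y1
    return entropy_bits(p_y1) + entropy_bits(joint.sum(axis=0)) - entropy_bits(joint)


p_y1 = [0.5, 0.25, 0.25]  # toy pmf with H(Y1) = 1.5 bits
for q in (0.0, 0.4, 1.0):
    print(q, erasure_mutual_information(p_y1, q), q * entropy_bits(p_y1))
```

The two printed columns should agree up to floating-point error for every q in [0, 1].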

Clearly, maximizing the rate implies maximizing q subject to the constraint $q \in [0, 1]$. This gives the following 
corollary to theorem 2: 

Corollary 1: For the general relay channel any rate R satisfying 

$$R < I(X; Y|X_1) + \left[\frac{I(X_1; Y)}{H(Y_1|X_1, Y)}\right]^* I(X; Y_1|X_1, Y), \tag{6}$$

for the joint distribution $p(x, x_1, y, y_1) = p(x)p(x_1)p(y, y_1|x, x_1)$, with $[x]^* = \min(x, 1)$, is achievable. 
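Since the rate in (6) contains no auxiliary random variable, it can be evaluated directly from the input distributions and the channel law. The sketch below does this for a small discrete memoryless relay channel. The helper names (`entropy`, `cond_mi`, `ts_eaf_rate`, `toy_channel`) and the toy channel itself are illustrative assumptions, not part of the paper: the relay observes the source noiselessly, and the destination observes only the relay input through a binary symmetric channel, so the direct link carries nothing.

```python
import itertools
import numpy as np


def entropy(joint, keep):
    """Entropy (bits) of the marginal of `joint` on the axes listed in `keep`."""
    drop = tuple(a for a in range(joint.ndim) if a not in keep)
    marg = joint.sum(axis=drop) if drop else joint
    marg = np.asarray(marg).ravel()
    marg = marg[marg > 0]
    return float(-(marg * np.log2(marg)).sum())


def cond_mi(joint, a, b, c=()):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C); a, b, c are axis tuples."""
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))


def ts_eaf_rate(p_x, p_x1, channel):
    """Evaluate the right-hand side of (6).

    channel[x, x1, y, y1] = p(y, y1 | x, x1).
    Axis order of the joint pmf: 0 = X, 1 = X1, 2 = Y, 3 = Y1.
    Returns (rate, q) in bits per channel use.
    """
    joint = p_x[:, None, None, None] * p_x1[None, :, None, None] * channel
    X, X1, Y, Y1 = (0,), (1,), (2,), (3,)
    direct = cond_mi(joint, X, Y, X1)                       # I(X;Y|X1)
    relay_link = cond_mi(joint, X1, Y)                      # I(X1;Y)
    h_y1 = entropy(joint, X1 + Y + Y1) - entropy(joint, X1 + Y)  # H(Y1|X1,Y)
    gain = cond_mi(joint, X, Y1, X1 + Y)                    # I(X;Y1|X1,Y)
    q = 1.0 if h_y1 <= 0 else min(1.0, relay_link / h_y1)   # [.]^* clipping
    return direct + q * gain, q


def toy_channel(p_flip):
    """Hypothetical channel: Y1 = X, and Y = X1 passed through a BSC(p_flip)."""
    ch = np.zeros((2, 2, 2, 2))
    for x, x1, y, y1 in itertools.product(range(2), repeat=4):
        p_y = 1 - p_flip if y == x1 else p_flip
        ch[x, x1, y, y1] = p_y * (1.0 if y1 == x else 0.0)
    return ch


uniform = np.array([0.5, 0.5])
rate, q = ts_eaf_rate(uniform, uniform, toy_channel(0.11))
print(f"q = {q:.3f}, rate = {rate:.3f} bits per channel use")
```

For this toy channel the rate reduces to the capacity of the relay-destination link, since the relay can forward only the fraction q of its (perfect) observation that this link supports.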
Now, consider the following distribution chain: 

$$p(x, x_1, y, y_1, \hat{y}_1, \hat{\hat{y}}_1) = p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|x_1, y_1)p(\hat{\hat{y}}_1|\hat{y}_1). \tag{7}$$

We note that this extended chain can be put into the standard form by letting 
$p(\hat{\hat{y}}_1|x_1, y_1) = \sum_{\hat{y}_1} p(\hat{\hat{y}}_1, \hat{y}_1|x_1, y_1) = \sum_{\hat{y}_1} p(\hat{y}_1|x_1, y_1)p(\hat{\hat{y}}_1|\hat{y}_1)$. 
After compression of $Y_1$ into $\hat{Y}_1$, there is a second compression operation, compressing 
$\hat{Y}_1$ into $\hat{\hat{Y}}_1$. The output of the second compression is used to facilitate cooperation between the relay and the 
destination. Therefore, the receiver decodes the message based on $\hat{\hat{y}}_1$ and $y$, repeating exactly the same steps as in 
the standard relay decoding, with $\hat{\hat{y}}_1$ replacing $\hat{y}_1$. Then, the expressions of theorem 2 become 

$$R < I(X; Y, \hat{\hat{Y}}_1|X_1), \tag{8}$$
$$\text{subject to } I(X_1; Y) \ge I(Y_1; \hat{\hat{Y}}_1|X_1, Y). \tag{9}$$

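Now, applying TS to $\hat{Y}_1$ with 

$$p(\hat{\hat{y}}_1|\hat{y}_1) = \begin{cases} q, & \hat{\hat{y}}_1 = \hat{y}_1 \\ 1 - q, & \hat{\hat{y}}_1 = \Delta \notin \hat{\mathcal{Y}}_1 \end{cases} \tag{10}$$

the expressions in (8) and (9) become 

$$\begin{aligned}
R &< I(X; Y, \hat{\hat{Y}}_1|X_1) \\
&= I(X; Y|X_1) + I(X; \hat{\hat{Y}}_1|X_1, Y) \\
&= I(X; Y|X_1) + H(X|X_1, Y) - H(X|\hat{\hat{Y}}_1, X_1, Y) \\
&= I(X; Y|X_1) + q\left(H(X|X_1, Y) - H(X|\hat{Y}_1, X_1, Y)\right) \\
&= I(X; Y|X_1) + qI(X; \hat{Y}_1|X_1, Y),
\end{aligned} \tag{11}$$

$$\begin{aligned}
I(X_1; Y) &\ge I(Y_1; \hat{\hat{Y}}_1|X_1, Y) \\
&= H(Y_1|X_1, Y) - (1 - q)H(Y_1|X_1, Y) - qH(Y_1|\hat{Y}_1, X_1, Y) \\
&= qI(Y_1; \hat{Y}_1|X_1, Y).
\end{aligned} \tag{12}$$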

Combining this with the constraint $q \in [0, 1]$ we obtain the following corollary to theorem 2: 

Proposition 1: For the general relay channel, any rate R satisfying 

$$R < I(X; Y|X_1) + \left[\frac{I(X_1; Y)}{I(Y_1; \hat{Y}_1|X_1, Y)}\right]^* I(X; \hat{Y}_1|X_1, Y),$$

for some joint distribution $p(x, x_1, y, y_1, \hat{y}_1) = p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|x_1, y_1)$, is achievable. 

This proposition generalizes corollary 1 by performing a general Wyner-Ziv (WZ) compression combined with 
TS (which is a specific type of WZ compression), intended to guarantee feasibility of the first compression step. 
In section IV we apply a similar idea to the EAF relaying in the Gaussian relay channel scenario with coded 
modulation. Before we discuss the relationship between joint-decoding and time-sharing we present an intuitive 
way to view the EAF strategy. 



C. An Intuitive View of Estimate-and-Forward 

Consider the rate bound and the feasible region of theorem 2 given in equations (3) and (4). In the following 
we provide an intuitive insight into these expressions in terms of a flow on a graph. We note that this intuitive 
explanation does not constitute a proof, but the achievable rates stated in this section can also be proved rigorously. 

In constructing the intuitive information flow representation for the relay channel, we first need to specify the 
underlying assumptions and the operations performed at the source, the relay and the destination receiver: 

• The source and the relay generate their codebooks independently. 

• The relay compresses its channel output $y_1$ into $\hat{y}_1$, which represents the information conveyed to the destination 
receiver to assist in decoding the source message. 

• Based on the above two restrictions we have the following Markov chain: $p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|x_1, y_1)$. 

• The relay input signal $x_1$ is based only on the compressed $\hat{y}_1$. 

• The destination uses $x_1$, $\hat{y}_1$ and $y$ to decode the source message $x$. 
We also use the following representation for transmission, reception and compression: 

• We represent an information source as a source whose output flow is equal to its information rate. 

• We represent the compression operation as a flow sink whose flow consumption is equal to the mutual 
information between the original and the compressed sequences. 

• The destination is represented as a flow sink. 

• As in a standard flow on a graph, the flows are additive, following the chain rule of mutual information. 
Fig. 2. The information flow budget for the general relay channel with compression at the relay. 

Now consider the flow diagram of figure 2. As can be observed from the figure, the source has an output flow of 

$$i_T = I(X; Y, \hat{Y}_1, X_1) = I(X; Y, \hat{Y}_1|X_1).$$

This follows from the fact that the destination uses $x_1$, $\hat{y}_1$ and $y$ to decode $x$ and the fact that $X$ and $X_1$ are 
independent. This total flow reaches the receiver through two branches, the direct branch (D) which carries a flow 
of $i_D = I(X; Y|X_1)$ and the relay branch (ABCE). Now, the quantities in the relay branch are calculated given 
$X_1$ and $Y$ to represent only the rate increase over the direct path. The relay branch has four parts: an edge (A) 
which carries a flow of $i_A = I(X; \hat{Y}_1|X_1, Y)$, a sink (B) with consumption $I(Y_1; \hat{Y}_1|X_1, Y)$, a relay source (C) with 
an output flow of $i_C = I(X_1; Y)$ and an edge (E) from the relay to the destination. Here, the relay transmission to the 
destination (C) is done at a fixed rate $I(X_1; Y)$, independent of the type of compression $p(\hat{y}_1|x_1, y_1)$ used at the relay, 
since we always transmit from the relay to the destination at the maximum possible rate in order to obtain the 
best performance. The rate loss due to compression is represented by $I(Y_1; \hat{Y}_1|X_1, Y)$, since we consider only the 
excess rates over the direct one. 

Now, from the laws of flow addition and conservation, the overall flow from the source to the destination through 
the relay branch is $i_E = i_A + i_B + i_C$, where $i_B = -I(Y_1; \hat{Y}_1|X_1, Y)$. To assist the direct link (D) we need the 
flow on (ABCE) to be non-negative. In theorem 2 the scheme considers only the last two elements, $i_B + i_C$, and 
verifies that their net flow is non-negative, namely 

$$-I(Y_1; \hat{Y}_1|X_1, Y) + I(X_1; Y) \ge 0. \tag{13}$$

This condition guarantees a net non-negative flow on (ABCE) since always $i_A \ge 0$. Now, the flow to the destination 
can be obtained as the minimum 

$$R < \min\{i_D + i_E,\ i_T\}, \tag{14}$$

where the second term in the minimum is obtained from the transmitter, since trivially the information rate at the 
receiver cannot exceed $i_T$. We note that because $i_B + i_C \ge 0$, the minimum in (14) is $i_T$. Therefore, the resulting 
achievable rate is 

$$R < I(X; Y, \hat{Y}_1|X_1),$$

which combined with (13) gives the result of [2, theorem 6]. 

However, the condition in (13) is not tight, since even when $i_B + i_C < 0$ the flow on (ABCE) is still non-negative 
if the entire sum $i_A + i_B + i_C$ is non-negative, i.e. 

$$I(X; \hat{Y}_1|X_1, Y) - I(Y_1; \hat{Y}_1|X_1, Y) + I(X_1; Y) \ge 0. \tag{15}$$

Then, the achievable rate to the destination is bounded by 

$$R < i_D + i_E = I(X; Y|X_1) + I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y). \tag{16}$$

Indeed, when the flow through the relay branch (ABCE) is zero we obtain the non-cooperative rate $I(X; Y|X_1)$. 
Plugging the expression (16) into (14) yields the following achievable rate: 

$$\begin{aligned}
R &< \min\{i_D + i_E,\ i_T\} \\
&= \min\left\{I(X; Y|X_1) + I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y),\ I(X; Y, \hat{Y}_1|X_1)\right\} \\
&= I(X; Y|X_1) + \min\left\{I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y),\ I(X; \hat{Y}_1|X_1, Y)\right\}.
\end{aligned}$$

Combining this with (15) (informally) proves the following proposition: 

Proposition 2: For the general relay channel, any rate R satisfying 

$$R < I(X; Y|X_1) + \min\left\{I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y),\ I(X; \hat{Y}_1|X_1, Y)\right\},$$
$$\text{subject to } I(X_1; Y) \ge I(\hat{Y}_1; Y_1|X, X_1, Y) = I(\hat{Y}_1; Y_1|X_1, Y) - I(X; \hat{Y}_1|X_1, Y),$$

for some joint distribution $p(x, x_1, y, y_1, \hat{y}_1) = p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|x_1, y_1)$, is achievable. 

The proof of proposition 2 can be made formal using joint-decoding at the destination receiver. However, since we 
show in the next subsection that this expression is a special case of [2, theorem 6] obtained by time-sharing, we omit 
the details of the proof here. 
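The flow bookkeeping of this subsection can be sketched in a few lines of code. The function below takes the four information quantities as plain numbers (in bits) and returns the feasibility flags of (13) and (15) together with the resulting rates. The function name `flow_budget`, its argument names and the numerical values are hypothetical; the values are chosen only so that (15) holds while (13) fails, and they respect the requirement that the compression loss is at least $i_A$.

```python
def flow_budget(i_d, i_a, loss, i_c):
    """Flow budget of the relay channel with compression at the relay.

    i_d  : I(X;Y|X1)          flow on the direct branch (D)
    i_a  : I(X;Yhat1|X1,Y)    flow on edge (A)
    loss : I(Y1;Yhat1|X1,Y)   consumption of the compression sink (B)
    i_c  : I(X1;Y)            output of the relay source (C)
    """
    i_t = i_d + i_a                    # total source flow I(X;Y,Yhat1|X1)
    i_e = i_a - loss + i_c             # net flow through the relay branch (ABCE)
    theorem6_feasible = i_c >= loss    # condition (13)
    branch_nonnegative = i_e >= 0      # relaxed condition (15)
    return {
        "i_T": i_t,
        "i_E": i_e,
        "theorem6_feasible": theorem6_feasible,
        "theorem6_rate": i_t if theorem6_feasible else None,
        "joint_feasible": branch_nonnegative,
        # Rate of proposition 2: minimum in (14) with i_E taken from (16).
        "joint_rate": min(i_d + i_e, i_t) if branch_nonnegative else None,
    }


# Illustrative numbers only: (13) fails because 0.4 < 0.6, but (15) holds.
budget = flow_budget(i_d=0.5, i_a=0.3, loss=0.6, i_c=0.4)
for name, value in budget.items():
    print(name, value)
```

A return value of `None` marks a compression mapping that the corresponding condition does not admit; it does not mean that the rate is zero.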
D. Joint-Decoding and Time-Sharing 

In the original work of [2, theorem 6], the decoding procedure at the destination receiver for decoding the message 
$w_{i-1}$ at time i is composed of three steps (the notations below are identical to [2, theorem 6]. The reader is referred 
to the proof of [2, theorem 6] to recall the definitions of the sets and variables used in the following description): 

1) Decode the relay index $s_i$ using $y(i)$, the received signal at time i. 

2) Decode the relay message $z_{i-1}$, using $s_i$, the received signal $y(i-1)$ and the previously decoded $s_{i-1}$. 

3) Decode the source message $w_{i-1}$ using $y(i-1)$, $z_{i-1}$ and $s_{i-1}$. 

Evidently, when decoding the relay message $z_{i-1}$ at the second step, the receiver does not make use of the 
statistical dependence between $\hat{y}_1(i-1)$, the relay sequence at time $i-1$, and $x(w_{i-1})$, the transmitted source 
codeword at time $i-1$. The way to use this dependence is to jointly decode $z_{i-1}$ and $w_{i-1}$ after decoding $s_i$ and 
$s_{i-1}$. The joint-decoding procedure then has the following steps: 

1) From $y(i)$, the received signal at time i, the receiver decodes $s_i$ by looking for a unique $s \in \mathcal{S}$, the set of 
indices used to select $x_1$, such that $(x_1(s), y(i)) \in A_\epsilon^{*(n)}$. As in [2, theorem 6], the correct $s_i$ can be decoded 
with an arbitrarily small probability of error by taking n large enough as long as 

$$R_0 < I(X_1; Y), \tag{17}$$

where $\|\mathcal{S}\| = 2^{nR_0}$. 

2) The receiver now knows the set $S_{s_i}$ into which $z_{i-1}$ (the relay message at time $i-1$) belongs. Additionally, 
from decoding at time $i-1$ the receiver knows $s_{i-1}$, used to generate $z_{i-1}$. 

3) The receiver generates the set $\mathcal{L}(i-1) = \left\{w \in \mathcal{W} : (x(w), y(i-1), x_1(s_{i-1})) \in A_\epsilon^{*(n)}\right\}$. 

4) The receiver now looks for a unique $w \in \mathcal{L}(i-1)$ such that $(x(w), y(i-1), \hat{y}_1(z|s_{i-1}), x_1(s_{i-1})) \in A_\epsilon^{*(n)}$ 
for some $z \in S_{s_i}$. If such a unique $w$ exists then it is the decoded $\hat{w}_{i-1}$, otherwise the receiver declares an 
error. 

We do not give here a formal proof for the resulting rate expression, but as indicated in section II-C the rate 
expression resulting from this decoding procedure is given by proposition 2. 

Let us now compare the rates obtained with joint-decoding (proposition 2) with the rates obtained with the 
sequential decoding of [2, theorem 6]. To that end we consider the joint-decoding result of proposition 2 with the 
extended probability chain of (7): 

$$p(x, x_1, y, y_1, \hat{y}_1, \hat{\hat{y}}_1) = p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|x_1, y_1)p(\hat{\hat{y}}_1|\hat{y}_1),$$

where $\hat{\hat{Y}}_1$ represents the information relayed to the destination. Expanding the expressions of proposition 2 using 
the assignment (10), similarly to proposition 1, we obtain the expressions: 

$$R < I(X; Y|X_1) + \min\left\{I(X_1; Y) - qI(\hat{Y}_1; Y_1|X, X_1, Y),\ qI(X; \hat{Y}_1|X_1, Y)\right\}, \tag{18}$$
$$\text{subject to } I(X_1; Y) \ge qI(\hat{Y}_1; Y_1|X, X_1, Y) = q\left(I(\hat{Y}_1; Y_1|X_1, Y) - I(X; \hat{Y}_1|X_1, Y)\right). \tag{19}$$

We can now make the following observations: 
1) Setting q = 1 we obtain proposition 2. Additionally, if $I(X_1; Y) \ge I(\hat{Y}_1; Y_1|X_1, Y)$ then both proposition 2 
and [2, theorem 6] give identical expressions. 

2) When q = 1 and 

$$I(\hat{Y}_1; Y_1|X_1, Y) - I(X; \hat{Y}_1|X_1, Y) < I(X_1; Y) < I(\hat{Y}_1; Y_1|X_1, Y), \tag{20}$$

then for the same mapping $p(\hat{y}_1|x_1, y_1)$ we obtain that proposition 2 provides a rate but [2, theorem 6] does 
not. The rate expression under these conditions is 

$$R < I(X; Y|X_1) + I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y). \tag{21}$$

3) Now, fix the probability chain $p(x)p(x_1)p(y, y_1|x, x_1)p(\hat{y}_1|x_1, y_1)$ and examine the expressions (18) and (19) 
when (20) holds: when q < 1, then (20) guarantees that condition (19) is still satisfied. If q is close enough 
to 1 such that we also have $I(X_1; Y) < qI(\hat{Y}_1; Y_1|X_1, Y)$, the rate from (18), i.e., 

$$R < I(X; Y|X_1) + I(X_1; Y) - qI(\hat{Y}_1; Y_1|X, X_1, Y),$$

is now greater than (21). In this case we can keep decreasing q until 

$$I(X_1; Y) - qI(\hat{Y}_1; Y_1|X, X_1, Y) = qI(X; \hat{Y}_1|X_1, Y), \tag{22}$$

at which point the rate becomes 

$$R < I(X; Y|X_1) + qI(X; \hat{Y}_1|X_1, Y). \tag{23}$$

This rate can be obtained from [2, theorem 6] by applying the extended probability chain of (7), as long as 

$$I(X_1; Y) \ge qI(Y_1; \hat{Y}_1|X_1, Y).$$

We therefore conclude that all the rates that joint decoding allows can also be obtained or exceeded by the original 
EAF with an appropriate time sharing.¹ 

Note that equality in (22) implies 

$$q_{opt} = \min\left\{1,\ \frac{I(X_1; Y)}{I(\hat{Y}_1; Y_1|X, X_1, Y) + I(X; \hat{Y}_1|X_1, Y)}\right\} = \min\left\{1,\ \frac{I(X_1; Y)}{I(\hat{Y}_1; Y_1|X_1, Y)}\right\},$$

hence $q_{opt}$ is the maximum q that makes the mapping $p(\hat{\hat{y}}_1|x_1, y_1)$ feasible for [2, theorem 6]. Plugging $q_{opt}$ into 
(23), we obtain the rate expression of proposition 1. 

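Finally, consider again the region where joint decoding is useful (20): 

$$\begin{aligned}
& I(\hat{Y}_1; Y_1|X, X_1, Y) < I(X_1; Y) < I(\hat{Y}_1; Y_1|X_1, Y) \\
\Rightarrow\ & 0 < I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y) < I(\hat{Y}_1; Y_1|X_1, Y) - I(\hat{Y}_1; Y_1|X, X_1, Y) \\
\Rightarrow\ & 0 < I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y) < I(X; \hat{Y}_1|X_1, Y).
\end{aligned}$$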

¹ This argument is due to Shlomo Shamai and Gerhard Kramer. 
If $I(X; \hat{Y}_1|X_1, Y) > 0$, then using time-sharing on $\hat{Y}_1$ with 

$$q = \frac{I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y)}{I(X; \hat{Y}_1|X_1, Y)} \tag{24}$$

in equations (11) and (12) yields: 

$$I(X; Y|X_1) + qI(X; \hat{Y}_1|X_1, Y) = I(X; Y|X_1) + I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y),$$

as long as $I(X_1; Y) \ge qI(Y_1; \hat{Y}_1|X_1, Y)$, or equivalently 

$$q \le \frac{I(X_1; Y)}{I(Y_1; \hat{Y}_1|X_1, Y)}. \tag{25}$$

Plugging assignment (24) into (25) we obtain: 

$$\begin{aligned}
\frac{I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y)}{I(X; \hat{Y}_1|X_1, Y)} &\le \frac{I(X_1; Y)}{I(\hat{Y}_1; Y_1|X_1, Y)} \\
\left(I(X_1; Y) - I(\hat{Y}_1; Y_1|X, X_1, Y)\right) I(\hat{Y}_1; Y_1|X_1, Y) &\le I(X_1; Y)\, I(X; \hat{Y}_1|X_1, Y) \\
I(X_1; Y)\, I(\hat{Y}_1; Y_1|X_1, Y) - I(X_1; Y)\, I(X; \hat{Y}_1|X_1, Y) &\le I(\hat{Y}_1; Y_1|X, X_1, Y)\, I(\hat{Y}_1; Y_1|X_1, Y) \\
I(X_1; Y)\, I(\hat{Y}_1; Y_1|X, X_1, Y) &\le I(\hat{Y}_1; Y_1|X, X_1, Y)\, I(\hat{Y}_1; Y_1|X_1, Y) \\
I(X_1; Y) &\le I(\hat{Y}_1; Y_1|X_1, Y),
\end{aligned}$$

as long as $I(\hat{Y}_1; Y_1|X, X_1, Y) > 0$, which is the region where joint-decoding is supposed to be useful. Hence the 
joint-decoding rate of proposition 2 can be obtained by time sharing on the [2, theorem 6] expression. Therefore, 
joint-decoding does not improve on the rate of [2, theorem 6]. In fact the rate of proposition 1 is always at least 
as large as that of proposition 2. 
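The comparison between joint-decoding and time-sharing can also be checked numerically. The following sketch treats the four information quantities as free numbers, subject only to the constraint that the compression loss is at least $I(X; \hat{Y}_1|X_1, Y)$, and evaluates the time-sharing rate with $q_{opt}$, the joint-decoding rate, and the time-sharing fraction of (24). All function names and all numerical values are hypothetical and serve only as an illustration.

```python
def time_sharing_rate(i_d, i_a, loss, i_c):
    """EAF with time-sharing: i_d + q_opt * i_a, q_opt = min(1, i_c / loss).

    i_d = I(X;Y|X1), i_a = I(X;Yhat1|X1,Y),
    loss = I(Y1;Yhat1|X1,Y), i_c = I(X1;Y).
    """
    q_opt = 1.0 if loss <= 0 else min(1.0, i_c / loss)
    return i_d + q_opt * i_a, q_opt


def joint_decoding_rate(i_d, i_a, loss, i_c):
    """Joint-decoding rate; None when its feasibility condition fails."""
    residual = loss - i_a          # I(Yhat1;Y1|X,X1,Y)
    if i_c < residual:
        return None
    return i_d + min(i_c - residual, i_a)


def q_matching_joint_decoding(i_a, loss, i_c):
    """Time-sharing fraction of (24); requires i_a > 0."""
    return (i_c - (loss - i_a)) / i_a


# Illustrative point inside region (20): 0.6 - 0.3 < 0.4 < 0.6.
i_d, i_a, loss, i_c = 0.5, 0.3, 0.6, 0.4
ts_rate, q_opt = time_sharing_rate(i_d, i_a, loss, i_c)
jd_rate = joint_decoding_rate(i_d, i_a, loss, i_c)
q24 = q_matching_joint_decoding(i_a, loss, i_c)
print("time-sharing rate:", ts_rate, "with q_opt =", q_opt)
print("joint-decoding rate:", jd_rate)
print("rate with q from (24):", i_d + q24 * i_a, "feasible:", q24 * loss <= i_c)
```

In this example the fraction from (24) reproduces the joint-decoding rate, while the larger fraction $q_{opt}$ is still feasible and gives a strictly higher rate.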

III. An Achievable Rate for the Relay Channel with Multiple Relays 

When the source-relay channel is very noisy then, as discussed in the introduction, it may be better not to use 
the relay at all than to employ the decode-and-forward strategy. Alternatively, when decode-and-forward is not 
useful, one could employ estimate-and-forward. One result for multiple relays based on EAF can be found in [9] 
which considered the two-relay case. In [8, theorem 3] the EAF strategy, with partial decoding was applied to 
the multiple-relay case, and in [8, theorem 4] a mixed EAF and DAF strategy was applied. However, as stated 
in [8, remark 22, remark 23] applying the general estimate-and-forward to a network with an arbitrary number of 
relays is computationally impractical due to the large number of constraints that characterize the feasible region 
(for two relays we need to satisfy 9 constraints). Moreover, the rate computation is prohibitive since it would 
imply solving a non-convex optimization problem. In conclusion, an alternative achievable rate to that based on 
decode-and-forward, which can also be evaluated with a reasonable effort, has not been presented to date. In this 
section we derive an explicit achievable rate based on estimate-and-forward. The strategy we use is to pick the 
auxiliary random variable such that the feasibility constraints are satisfied. This is not a trivial choice since setting 
the auxiliary random variable in theorem 2 to be the relay channel output (i.e. $\hat{Y}_1 = Y_1$) does not remove this 
constraint, and we therefore need to incorporate time-sharing as discussed in the following. 
A. A General Achievable Rate 

We extend the idea of section II-B to the relay channel with N relays. This channel consists of a source with channel 
input $X$, N relays where for relay i, $X_i$ denotes the channel input and $Y_i$ denotes the channel output, and a destination 
with channel output $Y$. This channel is denoted by $\left(\mathcal{X} \times_{i=1}^{N} \mathcal{X}_i,\ p(y, y_1, \ldots, y_N|x, x_1, \ldots, x_N),\ \mathcal{Y} \times_{i=1}^{N} \mathcal{Y}_i\right)$. 
Let $\underline{X} = (X_1, X_2, \ldots, X_N)$ and $\underline{Y} = (Y_1, Y_2, \ldots, Y_N)$. We now have the following theorem: 

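Theorem 3: For the general multiple-relay channel with N relays, $\left(\mathcal{X} \times_{i=1}^{N} \mathcal{X}_i,\ p(y, y_1, \ldots, y_N|x, x_1, \ldots, x_N),\ \mathcal{Y} \times_{i=1}^{N} \mathcal{Y}_i\right)$, 
any rate R satisfying 

$$R < I(X; Y|\underline{X}) + \sum_{\theta=1}^{2^N - 1} P(\mathrm{Bin}_N(\theta))\, I(X; \underline{Y}_{\mathrm{Bin}_N(\theta)}|\underline{X}, Y),$$

where $\mathrm{Bin}_N(\theta)$ is an N-element vector that contains '1' in the locations where the N-bit binary representation of 
the integer $\theta$ contains '1', $P(\mathrm{Bin}_N(\theta)) = \prod_{i: \mathrm{Bin}_N(\theta)_i = 0}(1 - q_i) \prod_{i: \mathrm{Bin}_N(\theta)_i = 1} q_i$, $\mathrm{Bin}_N(\theta)_i$ is the i'th bit in the 
N-bit binary representation of $\theta$, $\underline{Y}_{\mathrm{Bin}_N(\theta)} = (Y_{i_1}, Y_{i_2}, \ldots, Y_{i_m})$, where $i_1, i_2, \ldots, i_m$ are the locations of the '1' 
in $\mathrm{Bin}_N(\theta)$, and 

$$q_i = \left[\frac{I(X_i; Y|Z_i)}{H(Y_i|\underline{X}, Y) - \sum_{j=1}^{2^{L_i} - 1} P_i(\mathrm{Bin}_{L_i}(j))\, I(Y_i; \underline{Y}_{\mathrm{Bin}_{L_i}(j)}(T_i)|\underline{X}, Y)}\right]^* \tag{26}$$

for the joint distribution $p(x, x_1, x_2, \ldots, x_N, y, y_1, y_2, \ldots, y_N) = p(x)p(x_1) \cdots p(x_N)p(y, y_1, \ldots, y_N|x, x_1, \ldots, x_N)$, is 
achievable. In (26), $Z_i$ is the vector containing all the variables $X_j$ decoded prior to decoding $X_i$, $T_i$ is a vector 
that contains all the variables $\hat{Y}_p$ decoded prior to decoding $\hat{Y}_i$, and $\underline{Y}_{\mathrm{Bin}_{L_i}(j)}(T_i)$ contains all the $Y_r$ such 
that $\hat{Y}_r \in T_i$ and r is a location of '1' in the $L_i$-bit binary representation of j. $L_i$ is the number of elements in 
$T_i$. Note that if $\hat{Y}_p \in T_i$ then we must have $X_p \in Z_i$. 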

To facilitate the understanding of the expressions in theorem 3 we first look at a simplified case where the 
destination decodes each relay message independently of the messages of the other relays. This can be obtained 
from theorem 3 by setting $Z_i = \emptyset$ and $T_i = \emptyset$, $i = 1, 2, \ldots, N$. The result is summarized in the following corollary: 

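Corollary 2: For the general multiple-relay channel $\left(\mathcal{X} \times_{i=1}^{N} \mathcal{X}_i,\ p(y, y_1, \ldots, y_N|x, x_1, \ldots, x_N),\ \mathcal{Y} \times_{i=1}^{N} \mathcal{Y}_i\right)$, any 
rate R satisfying 

$$R < I(X; Y|\underline{X}) + \sum_{\theta=1}^{2^N - 1} P(\mathrm{Bin}_N(\theta))\, I(X; \underline{Y}_{\mathrm{Bin}_N(\theta)}|\underline{X}, Y), \tag{27}$$

is achievable, where 

$$q_i = \left[\frac{I(X_i; Y)}{H(Y_i|\underline{X}, Y)}\right]^* \tag{28}$$

for the joint distribution $p(x, x_1, x_2, \ldots, x_N, y, y_1, y_2, \ldots, y_N) = p(x)p(x_1) \cdots p(x_N)p(y, y_1, \ldots, y_N|x, x_1, \ldots, x_N)$. 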

In the multi-relay strategy we employ in this section each relay transmits its channel output $Y_i$ with probability
$q_i$, independent of the other relays. Therefore, when considering a group of $N$ relays, the probability that a given
subgroup of relays will transmit their channel outputs simultaneously is simply the product of the transmission
probabilities $q_i$ of the relays in the subgroup, multiplied by the product of the erasure probabilities $(1 - q_i)$ of the
relays in the complement group. Now, considering the rate expression of (27), we observe that the rate is obtained by
taking all possible groupings of relays. For each grouping the resulting rate is the rate obtained when using the
channel outputs of all the relays in that group to assist in decoding. This is indicated by the term $\mathbf{Y}_{\mathrm{Bin}_N(\theta)}$. This
rate has to be weighted by the probability of such an overlap occurring, which is given by $P(\mathrm{Bin}_N(\theta))$. We then
sum over all such groupings to obtain the achievable rate. The parameter $q_i$ for each relay, which is determined by
(28), can be interpreted by considering the terms in the denominator and numerator: the denominator $H(Y_i|\mathbf{X}, Y)$
is the (exponent of the) size of the uncertainty at the destination receiver about relay $i$'s output $Y_i$. The numerator is
the (exponent of the) size of the information set that can be transmitted from relay $i$ to the destination receiver.
Therefore, the fraction $\frac{I(X_i;Y)}{H(Y_i|\mathbf{X},Y)}$ can be interpreted as the maximal fraction of the uncertainty at the destination
about relay $i$'s channel output $Y_i$ that can be compensated by the relay transmission. Of course, this fraction has to
be upper bounded by one. In the more general setup of theorem 3, the decoding of the relay information from relay
$i$ is done by using the information from the relays which were decoded before relay $i$ to assist in decoding. This
results in the conditioning in the numerator and the negative terms in the denominator, both of which contribute to
increasing the value of $q_i$.
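
The weighting over relay groupings described above can be sketched numerically. The following minimal Python sketch enumerates all non-empty relay subsets, computes the weights $P(\mathrm{Bin}_N(\theta))$ and accumulates the right-hand side of (27). The function names (`bin_vector`, `subset_weight`, `ts_eaf_rate`), the most-significant-bit-first ordering, and all numerical values are assumptions made here for illustration only; they are not taken from the channel studied in this paper.

```python
def bin_vector(theta, n):
    """N-bit binary representation of theta as a tuple.

    Assumed convention: most significant bit first.
    """
    return tuple((theta >> (n - 1 - i)) & 1 for i in range(n))


def subset_weight(bits, q):
    """P(Bin_N(theta)): product of q_i over transmitting relays
    and of (1 - q_i) over erased relays."""
    w = 1.0
    for b, qi in zip(bits, q):
        w *= qi if b else (1.0 - qi)
    return w


def ts_eaf_rate(direct_rate, q, relay_gain):
    """Right-hand side of (27).

    direct_rate -- value of I(X;Y|X_1..X_N)
    q           -- list of time-sharing parameters q_i
    relay_gain  -- dict: non-zero bit pattern -> I(X; Y_Bin | X_1..X_N, Y)
    """
    n = len(q)
    total = direct_rate
    for theta in range(1, 2 ** n):
        bits = bin_vector(theta, n)
        total += subset_weight(bits, q) * relay_gain[bits]
    return total


# Hypothetical two-relay numbers, for illustration only.
q = [0.3, 0.5]
gain = {(0, 1): 0.10, (1, 0): 0.20, (1, 1): 0.25}
rate = ts_eaf_rate(0.4, q, gain)

# The weights over all 2^N patterns (including the all-erased one) sum to 1.
weights = [subset_weight(bin_vector(t, 2), q) for t in range(4)]
```

Including the all-zero pattern, the weights form a probability distribution over the relay groupings, which is why the sum in (27) can be read as an average of the per-grouping gains.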

B. Proof of Theorem 3

1) Overview of Coding Strategy: The transmitter generates its codebook independent of the relays. Next, each
relay generates its own codebook independent of the other relays following the construction of [2, theorem 6], with
the mapping $p(\hat{y}_i|x_i,y_i)$ at each relay set to the time-sharing mapping with parameter $q_i$, given explicitly in (29).
The destination receiver first needs to decode all the relay codewords $\{\mathbf{x}_i^n\}_{i=1}^{N}$ and use this information to decode
the relay messages $\{\hat{\mathbf{y}}_i^n\}_{i=1}^{N}$. To this end, the destination decides on a decoding order for the $\mathbf{x}_i^n$ sequences and a
decoding order for the $\hat{\mathbf{y}}_i^n$ sequences. These decoding orders determine the maximum value of $q_i$ that can be selected
for each relay, thereby allowing us to determine the auxiliary variables' mappings and obtain an explicit rate
expression. Finally, the receiver uses all the decoded $\{\mathbf{x}_i^n\}_{i=1}^{N}$ and $\{\hat{\mathbf{y}}_i^n\}_{i=1}^{N}$ sequences, together with its channel
input, to decode the source message.

We now give the details of the construction: fix the distributions $p(x)$, $p(x_1)$, $p(x_2)$, ..., $p(x_N)$, and

$$p(\hat{y}_i|x_i,y_i) = \begin{cases} q_i, & \hat{y}_i = y_i \\ 1-q_i, & \hat{y}_i = E \notin \mathcal{Y}_i \end{cases}, \tag{29}$$

$i = 1, 2, \ldots, N$. Let $\mathcal{W} = \{1, 2, \ldots, 2^{nR}\}$ be the source message set.

2) Code Construction at the Transmitter and the Relays:

• Code construction and transmission at the transmitter are the same as in [2, theorem 6].

• Code construction at the relays is done by repeating the relay code construction of [2, theorem 6] for each
relay, where relay $i$ uses the distributions $p(\hat{y}_i|x_i,y_i)$ and $p(x_i)$. We denote the relay message, the transmitted
message and the partition set at relay $i$ at time $k$ with $z_{i,k}$, $s_{i,k}$ and $S_{s_{i,k}}$ respectively. The message set for $s_i$ is
denoted $\mathcal{W}_i$, where $\|\mathcal{W}_i\| = 2^{nR_i}$. The message set for $z_i$ is denoted $\mathcal{W}'_i$, $\|\mathcal{W}'_i\| = 2^{nR'_i}$. The relay codewords
at relay $i$ are denoted $\hat{\mathbf{y}}_i(z_i|s_i)$, and the transmitted codewords at relay $i$ are denoted $\mathbf{x}_i(s_i)$, $s_i \in \mathcal{W}_i$, $z_i \in \mathcal{W}'_i$.

3) Decoding and Encoding at the Relays:
Consider relay $i$ at time $k-1$:

• From the relay transmission at time $k-1$, the relay knows $s_{i,k-1}$. Now the relay looks for a message $z_i \in \mathcal{W}'_i$
such that

$$\left(\hat{\mathbf{y}}_i(z_i|s_{i,k-1}),\ \mathbf{y}_i(k-1),\ \mathbf{x}_i(s_{i,k-1})\right) \in A_\epsilon^{*(n)}(\hat{Y}_i, Y_i, X_i).$$

Following the argument in [2, theorem 6], for $n$ large enough there is such a message $z_i$ with a probability
that is arbitrarily close to 1, as long as

$$R'_i > I(\hat{Y}_i;Y_i|X_i) + \epsilon = q_i H(Y_i|X_i) + \epsilon. \tag{30}$$

Denote this message with $z_{i,k-1}$.

• Let $s_{i,k}$ be the index of the partition of $\mathcal{W}'_i$ into which $z_{i,k-1}$ belongs, i.e., $z_{i,k-1} \in S_{s_{i,k}}$.

• At time $k$ relay $i$ transmits $\mathbf{x}_i(s_{i,k})$.

4) Decoding at the Destination:

• Consider the decoding of $s_{i,k}$ at time $k$, for a fixed decoding order: let $\mathbf{Z}_i$ contain all the $X_j$'s whose $s_{j,k}$'s
are decoded prior to decoding $s_{i,k}$. Therefore, decoding $s_{i,k}$ is done by looking for a unique message $s_i \in \mathcal{W}_i$
such that

$$\left(\mathbf{x}_i(s_i),\ \mathbf{x}_{m_1}(s_{m_1,k}),\ \mathbf{x}_{m_2}(s_{m_2,k}),\ \ldots,\ \mathbf{x}_{m_{M_i}}(s_{m_{M_i},k}),\ \mathbf{y}(k)\right) \in A_\epsilon^{*(n)}(X_i, \mathbf{Z}_i, Y),$$

where $m_1, m_2, \ldots, m_{M_i}$ enumerate all the $X_j$'s in $\mathbf{Z}_i = (X_{m_1}, X_{m_2}, \ldots, X_{m_{M_i}})$. Assuming correct decoding at
the previous steps, then by the point-to-point channel achievability proof we obtain that the probability of error
for decoding $s_{i,k}$ can be made arbitrarily small by taking $n$ large enough as long as

$$R_i < I(X_i;Y,\mathbf{Z}_i) - \epsilon = I(X_i;Y|\mathbf{Z}_i) - \epsilon. \tag{31}$$

Let $\mathbf{T}_i$ contain all the $\hat{Y}_j$'s whose $z_{j,k-1}$'s are decoded prior to decoding $z_{i,k-1}$. Note that all the $\{s_{j,k-1}\}_{j=1}^{N}$
were already decoded at the previous time interval when $w_{k-2}$ was decoded.

• The destination generates the set

$$\mathcal{L}_i(k-1) = \Big\{z_i \in \mathcal{W}'_i : \big(\mathbf{y}(k-1),\ \hat{\mathbf{y}}_i(z_i|s_{i,k-1}),\ \hat{\mathbf{y}}_{l'_1}(z_{l'_1,k-1}|s_{l'_1,k-1}),\ \ldots,\ \hat{\mathbf{y}}_{l'_{L'_i}}(z_{l'_{L'_i},k-1}|s_{l'_{L'_i},k-1}),$$
$$\mathbf{x}_1(s_{1,k-1}),\ \mathbf{x}_2(s_{2,k-1}),\ \ldots,\ \mathbf{x}_N(s_{N,k-1})\big) \in A_\epsilon^{*(n)}(Y, \hat{Y}_i, \mathbf{T}_i, \mathbf{X})\Big\}, \tag{32}$$

where $l'_1, l'_2, \ldots, l'_{L'_i}$ enumerate all the $\hat{Y}_j$'s in $\mathbf{T}_i$. The average size of $\mathcal{L}_i(k-1)$ can be bounded using the standard
technique of [2, equation (36)] and the fact that when $z_i \ne z_{i,k-1}$, then the corresponding $\hat{\mathbf{y}}_i(z_i|s_{i,k-1})$ is
independent of all the variables in (32), except $\mathbf{x}_i(s_{i,k-1})$. The resulting bound is

$$E\{\|\mathcal{L}_i(k-1)\|\} \le 1 + 2^{n\left(R'_i - I(\hat{Y}_i;Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) + 3\epsilon\right)},$$

where $\mathbf{X}_{-i}$ is an $N-1$ element vector that contains all the elements of $\mathbf{X}$ except $X_i$.

• Now, the destination looks for a unique $z_i \in \mathcal{L}_i(k-1) \cap S_{s_{i,k}}$. Therefore, making the probability of error
arbitrarily small by taking $n$ large enough can be done as long as

$$R'_i < I(\hat{Y}_i;Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) + I(X_i;Y|\mathbf{Z}_i) - 4\epsilon. \tag{33}$$

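We note that using the assignment (29) we can write

$$\begin{aligned}
I(\hat{Y}_i;Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) &= H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) - H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i,\hat{Y}_i) \\
&= H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) - (1-q_i)H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) - q_i H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i,Y_i) \\
&= q_i H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) - q_i H(Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i,Y_i) \\
&= q_i I(Y_i;Y,\mathbf{X}_{-i},\mathbf{T}_i|X_i) \\
&= q_i\left(H(Y_i|X_i) - H(Y_i|Y,\mathbf{X}_{-i},X_i,\hat{Y}_{l'_1},\mathbf{T}_i\setminus\hat{Y}_{l'_1})\right) \\
&= q_i\Big(q_{l'_1}H(Y_i|X_i) + (1-q_{l'_1})H(Y_i|X_i) \\
&\qquad - q_{l'_1}H(Y_i|Y,\mathbf{X}_{-i},X_i,Y_{l'_1},\mathbf{T}_i\setminus\hat{Y}_{l'_1}) - (1-q_{l'_1})H(Y_i|Y,\mathbf{X}_{-i},X_i,\mathbf{T}_i\setminus\hat{Y}_{l'_1})\Big) \\
&= q_i\left(q_{l'_1}I(Y_i;Y,\mathbf{X}_{-i},Y_{l'_1},\mathbf{T}_i\setminus\hat{Y}_{l'_1}|X_i) + (1-q_{l'_1})I(Y_i;Y,\mathbf{X}_{-i},\mathbf{T}_i\setminus\hat{Y}_{l'_1}|X_i)\right) \\
&= q_i\sum_{j=0}^{2^{L'_i}-1} P_{i'}(\mathrm{Bin}_{L'_i}(j))\, I(Y_i;Y,\mathbf{X}_{-i},\mathbf{Y}_{i,\mathrm{Bin}_{L'_i}(j)}(\mathbf{T}_i)|X_i),
\end{aligned}$$

where $\mathbf{T}_i\setminus\hat{Y}_{l'_1}$ denotes $\mathbf{T}_i$ with its first element removed,
$P_{i'}(\mathrm{Bin}_{L'_i}(j)) = \prod_{r:\mathrm{Bin}_{L'_i}(j)_r=1} q_{l'_r} \times \prod_{r:\mathrm{Bin}_{L'_i}(j)_r=0}(1-q_{l'_r})$, $\mathrm{Bin}_{L'_i}(j)_r$ is the $r$-th bit of the $L'_i$-bit binary
representation of $j$, and $\mathbf{Y}_{i,\mathrm{Bin}_{L'_i}(j)}(\mathbf{T}_i) = (Y_{l'_{n_1}}, Y_{l'_{n_2}}, \ldots, Y_{l'_{n_M}})$, where $n_1, n_2, \ldots, n_M$ are the locations of '1' in the
$L'_i$-bit binary representation of $j$, and $l'_{n_1}, l'_{n_2}, \ldots, l'_{n_M}$ are the indices of the $\hat{Y}_j$'s in locations $n_1, n_2, \ldots, n_M$ in $\mathbf{T}_i$.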
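For example, if $L'_i = 3$ and $j = 5$ then $\mathrm{Bin}_3(5) = (1, 0, 1)$ and $M = 2$, $n_1 = 1$, $n_2 = 3$. Letting $\mathbf{T}_i = (\hat{Y}_3, \hat{Y}_1, \hat{Y}_2)$
then $l'_1 = 3$, $l'_2 = 1$ and $l'_3 = 2$, and

$$P_{i'}(\mathrm{Bin}_3(5)) = q_{l'_1}(1 - q_{l'_2})\,q_{l'_3},$$
$$\mathbf{Y}_{i,\mathrm{Bin}_3(5)}(\mathbf{T}_i) = (Y_{l'_1}, Y_{l'_3}) = (Y_3, Y_2).$$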
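
The index bookkeeping in this example can be checked with a short script. The sketch below is illustrative only: the helper names (`bin_vector`, `select_from_order`), the most-significant-bit-first convention and the values of $q_1, q_2, q_3$ are assumptions introduced here. It takes the bit pattern $(1, 0, 1)$ (the integer 5 in binary) and the decoding order $(\hat{Y}_3, \hat{Y}_1, \hat{Y}_2)$, and returns the selected relay outputs together with the corresponding weight.

```python
def bin_vector(j, length):
    """length-bit binary representation of j (assumed: most significant bit first)."""
    return tuple((j >> (length - 1 - r)) & 1 for r in range(length))


def select_from_order(j, order, q):
    """Pick the relays of `order` that sit at the '1' positions of Bin(j).

    order -- relay indices (l'_1, ..., l'_L) in the decoding order T_i
    q     -- dict: relay index -> time-sharing parameter
    Returns (bits, selected relay indices, weight of the pattern).
    """
    bits = bin_vector(j, len(order))
    chosen = tuple(l for b, l in zip(bits, order) if b)
    prob = 1.0
    for b, l in zip(bits, order):
        prob *= q[l] if b else (1.0 - q[l])
    return bits, chosen, prob


order = (3, 1, 2)                 # T_i = (Y^_3, Y^_1, Y^_2)
q = {1: 0.2, 2: 0.5, 3: 0.4}      # hypothetical values, for illustration only
bits, chosen, prob = select_from_order(5, order, q)   # 5 = 101 in binary
# chosen lists the relays whose outputs appear in Y_{i,Bin}(T_i);
# prob equals q_3 * (1 - q_1) * q_2 for this pattern.
```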
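5) Combining the Bounds on $R'_i$: Applying the above scheme requires that $R'_i$ satisfies (30) and (33):

$$q_i H(Y_i|X_i) + \epsilon < q_i\sum_{j=0}^{2^{L'_i}-1} P_{i'}(\mathrm{Bin}_{L'_i}(j))\, I(Y_i;Y,\mathbf{X}_{-i},\mathbf{Y}_{i,\mathrm{Bin}_{L'_i}(j)}(\mathbf{T}_i)|X_i) + I(X_i;Y|\mathbf{Z}_i) - 4\epsilon,$$

which is satisfied if

$$\begin{aligned}
q_i &< \frac{I(X_i;Y|\mathbf{Z}_i) - 5\epsilon}{H(Y_i|X_i) - \sum_{j=0}^{2^{L'_i}-1} P_{i'}(\mathrm{Bin}_{L'_i}(j))\, I(Y_i;Y,\mathbf{X}_{-i},\mathbf{Y}_{i,\mathrm{Bin}_{L'_i}(j)}(\mathbf{T}_i)|X_i)} \\
&= \frac{I(X_i;Y|\mathbf{Z}_i) - 5\epsilon}{H(Y_i|X_i) - I(Y_i;Y,\mathbf{X}_{-i}|X_i) - \sum_{j=1}^{2^{L'_i}-1} P_{i'}(\mathrm{Bin}_{L'_i}(j))\, I(Y_i;\mathbf{Y}_{i,\mathrm{Bin}_{L'_i}(j)}(\mathbf{T}_i)|\mathbf{X},Y)} \\
&= \frac{I(X_i;Y|\mathbf{Z}_i) - 5\epsilon}{H(Y_i|\mathbf{X},Y) - \sum_{j=1}^{2^{L'_i}-1} P_{i'}(\mathrm{Bin}_{L'_i}(j))\, I(Y_i;\mathbf{Y}_{i,\mathrm{Bin}_{L'_i}(j)}(\mathbf{T}_i)|\mathbf{X},Y)}.
\end{aligned}$$

Combining with the constraint $0 < q_i \le 1$ gives the condition in (26).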

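Finally, the achievable rate is obtained as follows: using the decoded $\{\hat{\mathbf{y}}_i(z_{i,k-1}|s_{i,k-1})\}_{i=1}^{N}$ (assuming correct
decoding of all $\{z_{i,k-1}\}_{i=1}^{N}$) the receiver decodes the source message $w_{k-1}$ by looking for a message $w \in \mathcal{W}$ such
that

$$\big(\mathbf{x}(w),\ \hat{\mathbf{y}}_1(z_{1,k-1}|s_{1,k-1}),\ \hat{\mathbf{y}}_2(z_{2,k-1}|s_{2,k-1}),\ \ldots,\ \hat{\mathbf{y}}_N(z_{N,k-1}|s_{N,k-1}),$$
$$\mathbf{x}_1(s_{1,k-1}),\ \mathbf{x}_2(s_{2,k-1}),\ \ldots,\ \mathbf{x}_N(s_{N,k-1}),\ \mathbf{y}(k-1)\big) \in A_\epsilon^{*(n)}(X, \hat{\mathbf{Y}}, \mathbf{X}, Y),$$

where $\hat{\mathbf{Y}} = (\hat{Y}_1, \hat{Y}_2, \ldots, \hat{Y}_N)$. This results in an achievable rate of

$$R \le I(X;Y,\hat{\mathbf{Y}},\mathbf{X}) = I(X;Y,\hat{\mathbf{Y}}|\mathbf{X}).$$

Plugging in the assignments of all the $\hat{Y}_i$'s, we get the following explicit rate expression:

$$\begin{aligned}
I(X;Y,\hat{\mathbf{Y}}|\mathbf{X}) &= I(X;Y|\mathbf{X}) + I(X;\hat{\mathbf{Y}}|\mathbf{X},Y) \\
&= I(X;Y|\mathbf{X}) + H(X|\mathbf{X},Y) - H(X|\mathbf{X},Y,\hat{\mathbf{Y}}) \\
&= I(X;Y|\mathbf{X}) + H(X|\mathbf{X},Y) - (1-q_1)H(X|\mathbf{X},Y,\hat{Y}_2^N) - q_1 H(X|\mathbf{X},Y,\hat{Y}_2^N,Y_1) \\
&= I(X;Y|\mathbf{X}) + (1-q_1)I(X;\hat{Y}_2^N|\mathbf{X},Y) + q_1 I(X;\hat{Y}_2^N,Y_1|\mathbf{X},Y) \\
&= I(X;Y|\mathbf{X}) + \sum_{\theta=1}^{2^N-1} P(\mathrm{Bin}_N(\theta))\, I(X;\mathbf{Y}_{\mathrm{Bin}_N(\theta)}|\mathbf{X},Y),
\end{aligned}$$

where $\hat{Y}_2^N = (\hat{Y}_2, \ldots, \hat{Y}_N)$.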



C. Discussion 

To demonstrate the usefulness of the explicit EAF-based achievable rate of theorem 3 we compare it with the
DAF-based method of [5, theorem 3.1] for the two-relay case. For this scenario there are five possible DAF setups,
and the maximum of the five resulting rates is taken as the DAF-based rate:

$$\begin{aligned}
R_{DAF} &= \sup_{p(x,x_1,x_2)} \max\{R_1, R_2, R_{12}, R_{21}, R_G\} \\
R_1 &= \max_{x_2\in\mathcal{X}_2} \min\{I(X;Y_1|X_1,x_2),\ I(X;Y|X_1,x_2) + I(X_1;Y|x_2)\} \\
R_2 &= \max_{x_1\in\mathcal{X}_1} \min\{I(X;Y_2|X_2,x_1),\ I(X;Y|X_2,x_1) + I(X_2;Y|x_1)\} \\
R_{12} &= \min\{I(X;Y_1|X_1,X_2),\ I(X;Y_2|X_1,X_2) + I(X_1;Y_2|X_2),\ I(X;Y|X_1,X_2) + I(X_1;Y|X_2) + I(X_2;Y)\} \\
R_{21} &= \min\{I(X;Y_2|X_1,X_2),\ I(X;Y_1|X_1,X_2) + I(X_2;Y_1|X_1),\ I(X;Y|X_1,X_2) + I(X_2;Y|X_1) + I(X_1;Y)\} \\
R_G &= \min\{I(X;Y_1|X_1,X_2),\ I(X;Y_2|X_1,X_2),\ I(X,X_1,X_2;Y)\},
\end{aligned}$$

where $R_1$ is the rate obtained when only relay 1 is active, $R_2$ is the rate obtained when only relay 2 is active, $R_{12}$
is the rate obtained when relay 1 decodes first and relay 2 decodes second and $R_{21}$ is the rate obtained when this
order is reversed. $R_G$ is the rate obtained when both relays form one group$^2$. Now, as in the single-relay case, DAF
is limited by the worst source-relay link. Therefore, if

$$R_{PTP} > \max_{p(x|x_1,x_2),\,(x_1,x_2)\in\mathcal{X}_1\times\mathcal{X}_2} \{I(X;Y_1|x_1,x_2),\ I(X;Y_2|x_1,x_2)\}, \tag{34}$$

where $R_{PTP} = \max_{p(x|x_1,x_2),\,(x_1,x_2)\in\mathcal{X}_1\times\mathcal{X}_2} I(X;Y|x_1,x_2)$ is the point-to-point rate, then it is better not to use
[5, theorem 3.1] at all, but rather set the relays to transmit the symbol pair $(x_1,x_2) \in \mathcal{X}_1\times\mathcal{X}_2$ such that the
point-to-point rate is maximized. However, the rate obtained using corollary 2 for the two-relay case is given by

$$\begin{aligned}
R_{TS\text{-}EAF} \le \sup_{p(x)p(x_1)p(x_2)}\ & I(X;Y|X_1,X_2) + q_1(1-q_2)\,I(X;Y_1|X_1,X_2,Y) \\
& + (1-q_1)q_2\,I(X;Y_2|X_1,X_2,Y) + q_1 q_2\,I(X;Y_1,Y_2|X_1,X_2,Y),
\end{aligned}$$

where $q_1$ and $q_2$ are positive and determined according to (28). This expression can, in general, be greater than $R_{PTP}$
even when (34) holds, for channels where the relay-to-destination links are very good. Hence, this explicit achievable
expression provides an easy way to improve upon the DAF-based achievable rates when the source-to-relay links
are very noisy.

To demonstrate this, consider the channel given in table I over binary RVs $X$, $X_1$, $X_2$, $Y$, $Y_1$ and $Y_2$. The
channel distribution was constructed under the independence constraint

$$p(y, y_1, y_2|x, x_1, x_2) = p(y_1|x, x_1, x_2)\,p(y_2|x, x_1, x_2)\,p(y|x, x_1, x_2, y_1, y_2),$$

i.e. given the channel inputs, the two relay outputs are independent. This channel is characterized by noisy
source-relay links, while the link from relay 1 to the destination has low noise. Therefore, DAF is inferior to the
point-to-point transmission but EAF is able to exceed this rate, by giving up a small amount of rate on the direct link
(compared to the point-to-point rate) and gaining more rate through the relays. The numerical evaluation of the

2 In fact, since we take the supremum over all p.d.f.'s $p(x, x_1, x_2)$ we do not need to explicitly include $R_1$ and $R_2$ in the maximization, but
they are included here to provide a complete presentation.
TABLE I
p(y, y1, y2 | x, x1, x2) FOR THE EAF EXAMPLE.

(x,x1,x2)    p(y, y1, y2 | x, x1, x2), one column per output triple:
             000          001          010          011          100          101          110          111
000          8.047314e-2  1.948360e-1  2.041506e-1  4.523933e-2  2.423322e-1  7.057734e-3  1.310053e-1  9.490483e-2
001          8.601616e-1  6.643713e-2  1.662897e-2  1.937227e-2  1.859104e-2  1.741020e-2  8.833169e-4  5.154431e-4
010          3.131504e-1  1.821840e-1  5.618147e-2  1.522841e-1  5.290856e-2  1.555570e-1  3.214581e-2  5.558854e-2
011          5.183921e-3  3.704625e-1  1.641795e-2  2.208356e-1  1.660775e-3  2.355928e-1  9.590170e-4  1.488874e-1
100          8.116746e-3  8.139504e-3  9.387860e-2  1.736515e-2  1.039350e-1  7.308714e-3  7.612555e-1  7.612563e-7
101          4.824126e-2  1.196128e-1  1.705739e-1  7.127199e-2  4.631349e-2  1.955324e-1  1.928693e-1  1.555848e-1
110          9.367321e-2  1.248830e-1  1.873302e-1  6.161358e-2  5.827773e-2  1.906660e-1  1.589616e-1  1.245946e-1
111          9.141272e-7  9.141263e-1  7.618061e-3  3.435473e-2  7.974830e-4  4.117531e-2  9.302643e-4  9.969457e-4


TABLE II
Optimal distribution for DAF

(x,x1,x2)    p(x,x1,x2)
000          5.698189907239905e-009
001          5.259061814752764e-017
010          4.301809992760095e-009
011          4.424193267301109e-001
100          6.792096128437060e-009
101          4.740938235494830e-017
110          3.207903771562940e-009
111          5.575806532698892e-001



TABLE III
Optimal distribution for EAF

Pr(X = 0)    = 4.3752093552645e-001
Pr(X1 = 0)   = 1.9388669163312e-001
Pr(X2 = 0)   = 1.000000000000000e-009



rates for this channel produces

$$R_{PTP} = 0.2860323, \qquad R_{DAF} = 0.2408629, \qquad R_{TS\text{-}EAF} = 0.2924798,$$

where the optimal distributions that achieve these rates are summarized in tables II and III. The optimal DAF
distribution fixes both $X_1$ and $X_2$ to '1' and sets the probability of $X$ to be $\Pr(X = 1) = 0.442419$, as expected
for the case where the relays limit the achievable rate. For the EAF, the useless relay 2 is fixed to 0, to facilitate
transmission with the useful relay 1. In accordance, we obtain time-sharing proportions of $q_1 = 0.156947$ and
$q_2 \approx 0$ for relay 1 and relay 2 respectively. We note that in this scenario even the single-relay
TS-EAF outperforms the two-relay DAF.

3 The resulting rates were obtained by optimizing for the rates with random initial input distributions. The optimization was repeated 50 times for
each rate and the maximum resulting rate was recorded. The m-files used for this evaluation are available at http://cn.ece.cornell.edu
IV. The Gaussian Relay Channel

In this section we investigate the application of estimate-and-forward with time-sharing to the Gaussian relay
channel. For this channel, the common practice is to use Gaussian codebooks and Gaussian quantization at the
relay. The rate in Gaussian scenarios where coded modulation is applied is usually analyzed by applying DAF at
the relay. In this section we show that when considering coded modulation, one should select the relay strategy
according to the channel condition: Gaussian quantization seems a good choice when the SNR at the relay is low and
DAF appears to be superior when the relay enjoys high SNR conditions. However, for intermediate SNR there is
much room for optimizing the estimation mapping at the relay.

In the following we first recall the Gaussian relay channel with a Gaussian codebook, and then we consider the
Gaussian relay channel under a BPSK modulation constraint. Since we focus on the mapping at the relay we consider
here the Gaussian relay channel with an orthogonal relay of finite capacity $C$, also considered in [11]. This scenario
is depicted in figure 3.


Fig. 3. The Gaussian relay channel with a finite capacity noiseless relay link between the relay and the destination. 

Here $Y_1 = g \cdot X + N_1$ is the channel output at the relay, $Y = X + N$ is the channel output at the receiver,
which decodes the message based on $(Y^n, \hat{Y}_1^n)$. Let $\mathcal{W} = \{1, 2, \ldots, 2^{nR}\}$ denote the source message set, and let
the source have an average power constraint $P$:

$$\frac{1}{n}\sum_{i=1}^{n} x_i^2(w) \le P, \qquad \forall w \in \mathcal{W}.$$

The relay signal $\hat{Y}_1^n$ is transmitted to the destination through a finite-capacity noiseless link of capacity $C$. For this
scenario the expressions of [2, theorem 6] specialize to

$$R \le I(X;Y,\hat{Y}_1) \tag{35a}$$
$$\text{subject to } C \ge I(Y_1;\hat{Y}_1|Y), \tag{35b}$$

with the Markov chain $X, Y - Y_1 - \hat{Y}_1$.

We also consider in this section the DAF method whose information rate is given by (see [2, theorem 1])

$$R_{DAF} = \min\{I(X;Y_1),\ I(X;Y) + C\},$$

and the upper bound of [2, theorem 3]:

$$R_{upper} = \min\{I(X;Y) + C,\ I(X;Y,Y_1)\}.$$

We note that although these expressions were derived for the finite, discrete alphabets case, following the argument
in [8, remark 30], they also hold for the Gaussian case.

A. The Gaussian Relay Channel with Gaussian Codebooks 

When $X \sim \mathcal{N}(0, P)$, i.i.d., then the channel outputs at the relay and the receiver are jointly Normal RVs:

$$\begin{pmatrix} Y \\ Y_1 \end{pmatrix} \sim \mathcal{N}\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix},\ \begin{pmatrix} P + \sigma^2 & gP \\ gP & g^2P + \sigma_1^2 \end{pmatrix}\right).$$

The compression is achieved by adding to $Y_1$ a zero mean independent Gaussian RV, $N_Q$:

$$\hat{Y}_1 = Y_1 + N_Q, \qquad N_Q \sim \mathcal{N}(0, \sigma_Q^2). \tag{36}$$

We refer to the assignment (36) as Gaussian-quantization estimate-and-forward (GQ-EAF). Evaluating the
expressions (35a) and (35b) with assignment (36) results in (see also [11]; the expressions are written for the
normalization $\sigma^2 = \sigma_1^2 = 1$):

$$I(X;Y,\hat{Y}_1) = \frac{1}{2}\log_2\left(1 + P + \frac{g^2P}{1 + \sigma_Q^2}\right), \tag{37a}$$

$$I(Y_1;\hat{Y}_1|Y) = \frac{1}{2}\log_2\left(1 + \frac{1 + P + g^2P}{(P+1)\,\sigma_Q^2}\right). \tag{37b}$$

The feasibility condition (35b) yields

$$\sigma_Q^2 \ge \frac{1 + P + g^2P}{(2^{2C} - 1)(P + 1)},$$

and because maximizing the rate (37a) requires minimizing $\sigma_Q^2$, the resulting GQ-EAF rate expression is

$$R \le \frac{1}{2}\log_2\left(1 + P + \frac{g^2P}{1 + \frac{1 + P + g^2P}{(2^{2C} - 1)(P + 1)}}\right).$$

Now, when using Gaussian quantization at the relay it is obvious that time sharing does not help: we need the
minimum $\sigma_Q^2$ in order to maximize the rate. This minimum is obtained only when the entire capacity of the relay
link is dedicated to the transmission of the (minimally) quantized $Y_1$. However, when we consider the Gaussian
relay channel with coded modulation, the situation is quite different, as we show in the remainder of this section.
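
As a numerical illustration of the GQ-EAF expressions, the following minimal Python sketch evaluates the smallest feasible quantization-noise variance and the resulting rate. It assumes unit noise variances ($\sigma^2 = \sigma_1^2 = 1$) and $C > 0$; the function names (`sigma_q_min`, `relay_link_rate`, `gq_eaf_rate`) and the sample values of $P$, $g$ and $C$ are introduced here for illustration and are not part of the original evaluation.

```python
import math


def sigma_q_min(P, g, C):
    """Smallest sigma_Q^2 allowed by the feasibility condition (35b).

    Assumes unit noise variances and C > 0.
    """
    return (1.0 + P + g * g * P) / ((2.0 ** (2.0 * C) - 1.0) * (P + 1.0))


def relay_link_rate(P, g, sigma_q2):
    """I(Y1; Y1_hat | Y) in bits for Gaussian quantization noise sigma_q2."""
    return 0.5 * math.log2(1.0 + (1.0 + P + g * g * P) / ((P + 1.0) * sigma_q2))


def gq_eaf_rate(P, g, C):
    """GQ-EAF rate: I(X; Y, Y1_hat) evaluated at the minimal feasible sigma_Q^2."""
    s = sigma_q_min(P, g, C)
    return 0.5 * math.log2(1.0 + P + g * g * P / (1.0 + s))


P, g, C = 1.0, 1.0, 1.0           # sample values, for illustration only
s_min = sigma_q_min(P, g, C)
rate = gq_eaf_rate(P, g, C)
# At the minimal sigma_Q^2 the relay link is used at exactly its capacity C.
link_usage = relay_link_rate(P, g, s_min)
```

Since the rate is decreasing in $\sigma_Q^2$, any choice that leaves part of the link capacity unused (as time-sharing would) can only reduce the Gaussian-quantization rate, which is the point made in the paragraph above.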
B. The Gaussian Relay Channel with Coded Modulation

Consider the Gaussian relay channel where $X$ is an equiprobable BPSK signal of amplitude $\sqrt{P}$:

$$\Pr(X = \sqrt{P}) = \Pr(X = -\sqrt{P}) = \frac{1}{2}. \tag{38}$$

Under these conditions, the received symbols $(Y, Y_1)$ are no longer jointly Gaussian, but follow a Gaussian-mixture
distribution:

$$\begin{aligned}
f(y, y_1) &= \Pr(X = \sqrt{P})\,f(y, y_1|x = \sqrt{P}) + \Pr(X = -\sqrt{P})\,f(y, y_1|x = -\sqrt{P}) \\
&= \frac{1}{2}\left(G_y(\sqrt{P}, \sigma^2)\,G_{y_1}(g\sqrt{P}, \sigma_1^2) + G_y(-\sqrt{P}, \sigma^2)\,G_{y_1}(-g\sqrt{P}, \sigma_1^2)\right),
\end{aligned}$$

where

$$G_x(a, b) \triangleq \frac{1}{\sqrt{2\pi b}}\,e^{-\frac{(x-a)^2}{2b}}. \tag{39}$$

Contrary to the Gaussian codebook case, where it is hard to identify a mapping $p(\hat{y}_1|y_1)$ that will be superior
to Gaussian quantization (if indeed such a mapping exists), in this case it is natural to compare the
Gaussian mapping of (36), which induces a Gaussian-mixture distribution for $\hat{Y}_1$, with other possible mappings. In
the case of binary inputs it is natural to consider binary mappings for $\hat{Y}_1$. We can predict that such mappings will
do well at high SNR on the source-relay link, when the probability of error for symbol-by-symbol detection at the
relay is small, with a much smaller complexity than Gaussian quantization. We start by considering two types of
hard-decision (HD) mappings:

1) The first mapping is HD-EAF: The relay first makes a hard decision about every received $Y_1$ symbol,
determining whether it is positive or negative, and then randomly decides if it is going to transmit this
decision or transmit an erasure symbol $E$ instead. The probability of transmitting an erasure, $1 - P_{\text{no erase}}$, is
used to adjust the conference rate such that the feasibility constraint is satisfied. Therefore, the conditional
distribution $p(\hat{Y}_1|Y_1)$ is given by:

$$p(\hat{Y}_1|Y_1 > 0) = \begin{cases} P_{\text{no erase}}, & \hat{Y}_1 = 1 \\ 1 - P_{\text{no erase}}, & \hat{Y}_1 = E \end{cases} \tag{40a}$$

$$p(\hat{Y}_1|Y_1 \le 0) = \begin{cases} P_{\text{no erase}}, & \hat{Y}_1 = -1 \\ 1 - P_{\text{no erase}}, & \hat{Y}_1 = E \end{cases} \tag{40b}$$

This choice is motivated by the time-sharing method considered in section II: after making a hard decision
on the received symbol's sign (positive or negative), the relay applies TS to that decision so that the rate
required to transmit the resulting random variable is less than $C$. This facilitates transmission to the destination
through the conference link. Since the entropy of the sign decision is 1, then when $C \ge 1$ we can transmit
the sign decisions directly without using an erasure. Therefore, we expect that for values of $C$ in the range
$C > 1$, this mapping will not exceed the rate obtained for $C = 1$. The focus is, therefore, on values of $C$
that are less than 1. The expressions for this assignment are given in appendix A-A.



February 1, 2008 



DRAFT 



23 



2) The second method is deterministic hard-decision. In this approach, we select a threshold $T$ such that the
range of $Y_1$ is partitioned into three regions: $Y_1 < -T$, $-T \le Y_1 \le T$, $Y_1 > T$. Then, according to the value
of each received $Y_1$ symbol, the corresponding $\hat{Y}_1$ is deterministically determined:

$$\hat{Y}_1 = \begin{cases} 1, & Y_1 > T \\ E, & -T \le Y_1 \le T \\ -1, & Y_1 < -T \end{cases} \tag{41}$$

The threshold $T$ is selected such that the achievable rate is maximized subject to satisfying the feasibility
constraint. We refer to this method as deterministic HD (DHD). This is therefore another type of TS, in
which the erasure probability is determined by the fraction of the time the relay input is between $-T$ and $T$.
This method should be better than HD-EAF at high relay SNR since for HD-EAF, erasure is selected without
any regard to the quality of the decision: both good sign decisions and bad sign decisions are erased with
the same probability. In DHD, however, the erased area is the area where the decisions have low quality in
the first place, and all high quality decisions are sent. On the other hand, at low relay SNR and small capacity for the
relay-destination link, HD-EAF may perform better than DHD since the erased area (i.e. the region between
$-T$ and $+T$) for the DHD mapping has to be very large to allow 'squeezing' the estimate through the relay
link, while HD-EAF may require less compression of the HD output. The expressions for evaluating the rate
of the DHD assignment are given in appendix A-B.
We now examine the performance of each technique using numerical evaluation: first, we examine the achievable
rates with HD-EAF. The expressions are evaluated for $\sigma_1^2 = \sigma^2 = 1$ and $P = 1$. For every pair of values $(g, C)$
considered, the maximum $P_{\text{no erase}}$ was selected. Figure 4 depicts the achievable rate vs. $g$ for $0.4 \le C \le 2$, together
with the upper bound and the decode-and-forward rate. As can be observed from figure 4, the information rate of
HD-EAF increases with $C$ until $C = 1$ and then remains constant. It is also seen that for small values of $g$, HD-EAF
is better than DAF. This region of $g$ increases with $C$, and for $C \ge 1$ the crossover value of $g$ is approximately
1.71. However, even for $g = 2$, DAF is only 2.5% better than HD-EAF.

Next, examine DHD: as can be seen from figure 5, for small values of $C$, DAF exceeds the information rate
of DHD for values of $g$ greater than 1, but for $C \ge 0.8$, DHD is superior to DAF, and in fact DAF approaches
DHD from below. Another phenomenon evident from the figure (especially for $C = 0.8$) is the existence of a threshold:
for low values of $C$ there is some $g$ at which the DHD rate exhibits a jump. This can be explained by looking
at figure 6, which depicts the values of $I(X;\hat{Y}_1,Y)$ and $I(Y_1;\hat{Y}_1|Y)$ vs. the threshold $T$: the bold-solid graph of
$I(Y_1;\hat{Y}_1|Y)$ can intersect the bold-dashed horizontal line representing $C$ at two values of $T$. We also note that for
small $T$ the value of $I(X;\hat{Y}_1,Y)$ is generally greater than for large $T$. Now, the jump can be explained as follows:
as shown in appendix A-B.1, for small $T$ and $g$, $I(Y_1;\hat{Y}_1|Y)$ is bounded from below. Now, if this bound value is
greater than $C$ then the intersection will occur only at a large value of $T$, hence the small rate. When $g$ increases,
the value of $I(Y_1;\hat{Y}_1|Y)$ for small $T$ decreases accordingly, until at some $g$ it intersects $C$ for a small $T$ as well
as for a large $T$, as indicated by the arrow in the right-hand part of figure 6. This allows us to obtain the rates in
[Figure 4 (plot omitted). Horizontal axis: g - Relay Channel Gain. Curves: Hard Decision, Upper Bound and DAF, for several values of C.]

Fig. 4. Information rate with BPSK and hard decision EAF mapping at the relay vs. relay channel gain $g$, for different values of $C$.

Fig. 5. Information rate with BPSK, for deterministic hard decision at the relay vs. relay channel gain $g$, for different values of $C$.

Fig. 6. $I(Y_1;\hat{Y}_1|Y)$ and $I(X;\hat{Y}_1,Y)$ vs. threshold $T$ for $(g, C) = (0.4, 0.8)$ (left) and $(g, C) = (1.4, 0.8)$ (right). The bold solid line
represents $I(Y_1;\hat{Y}_1|Y)$, the bold dashed line represents $C = 0.8$, $I(X;Y,\hat{Y}_1)$ is represented by the dash-dot line and the resulting information
rate is depicted with the solid line.



the region of small T which are in general higher than the rates for large T and this is the source of the jump in 
the achievable rate. 

C. Time-Sharing Deterministic Hard-Decision (TS-DHD) 

It is evident from the above numerical evaluation that neither of the two mappings, HD-EAF and DHD, is
universally better than the other: when $g$ is small and $C$ is less than 1, HD-EAF performs better than DHD,
since the erased region of DHD is too large, and when $g$ increases, DHD performs better than HD-EAF since it erases only
the low quality information. It is therefore natural to consider a third mapping which combines both aspects of
binary mapping at the relay, namely deterministically erasing low quality information and then randomly gating
the resulting discrete variable in order to allow its transmission over the conference link. This hybrid mapping is






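given in the following equation:

$$p(\hat{Y}_1|Y_1 > T) = \begin{cases} P_{\text{no erase}}, & \hat{Y}_1 = 1 \\ 1 - P_{\text{no erase}}, & \hat{Y}_1 = E \end{cases} \tag{42a}$$

$$p(\hat{Y}_1 = E\,|\,|Y_1| \le T) = 1 \tag{42b}$$

$$p(\hat{Y}_1|Y_1 < -T) = \begin{cases} P_{\text{no erase}}, & \hat{Y}_1 = -1 \\ 1 - P_{\text{no erase}}, & \hat{Y}_1 = E \end{cases} \tag{42c}$$

In this mapping, the region $|Y_1| \le T$ is always erased, and the complement region is erased with probability
$P_{\text{erase}} = 1 - P_{\text{no erase}}$. Of course, now both $T$ and $P_{\text{erase}}$ have to be optimized. The expressions for TS-DHD can be
found in appendix A-C. Figure 7 compares the performance of DHD, HD-EAF and TS-DHD. As can be seen, the
hybrid method enjoys the benefits of both types of mappings and is the superior method.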
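
The relation between the three binary mappings can be made concrete with a small per-symbol sketch. The Python function below is an illustrative assumption (the name `ts_dhd_map`, the string `"E"` for the erasure symbol and the use of Python's `random` module are choices made here, not part of the original evaluation). It implements the TS-DHD mapping; setting the threshold to zero gives HD-EAF (up to the zero-probability event $Y_1 = 0$), and setting $P_{\text{no erase}} = 1$ gives DHD.

```python
import random

ERASURE = "E"


def ts_dhd_map(y1, threshold, p_no_erase, rng=random):
    """Map one relay observation y1 to +1, -1 or an erasure.

    Symbols with |y1| <= threshold are always erased (deterministic part);
    the remaining sign decisions are kept with probability p_no_erase
    (time-sharing part).
    """
    if abs(y1) <= threshold:
        return ERASURE
    if rng.random() >= p_no_erase:
        return ERASURE
    return 1 if y1 > 0 else -1


# DHD is the special case p_no_erase = 1 (no random gating).
dhd_low = ts_dhd_map(0.3, 0.5, 1.0)     # inside the erased region
dhd_pos = ts_dhd_map(0.7, 0.5, 1.0)     # reliable positive decision
dhd_neg = ts_dhd_map(-0.7, 0.5, 1.0)    # reliable negative decision

# HD-EAF is the special case threshold = 0 (random gating only).
rng = random.Random(1)
outs = [ts_dhd_map(1.0, 0.0, 0.25, rng) for _ in range(10000)]
kept_fraction = outs.count(1) / len(outs)   # close to p_no_erase = 0.25
```

Keeping both knobs in one function mirrors the statement that both $T$ and $P_{\text{erase}}$ have to be optimized jointly for TS-DHD.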



[Figure 7 (plot omitted). Horizontal axis: g - Relay Channel Gain.]



Fig. 7. Information rate with BPSK, for HD-EAF, DHD and TS-DHD at the relay vs. relay channel gain g, for different values of C. 



Next, figure 8 compares the performance of TS-DHD, GQ-EAF, and DAF. As can be seen from the figure,
Gaussian quantization is not always the optimal choice: for $C = 0.6$ (the lines with diamond-shaped markers) we
have that GQ-EAF is the best method for $g < 1.05$, for $1.05 < g < 1.55$ TS-DHD is the best method and for
$g > 1.55$ DAF achieves the highest rate. For $C = 1$ (x-shaped markers) TS-DHD is superior to both GQ-EAF
and DAF for $g > 0.9$, and for $C = 2$, GQ-EAF is the superior method for all $g < 2$. This suggests that for the
practical Gaussian relay scenario, where the modulation constraint is taken into account, there is room to optimize
the mapping at the relay since the choice of Gaussian quantization is not always optimal.
[Figure 8 (plot omitted). Horizontal axis: g - Relay Channel Gain.]



Fig. 8. Information rate with BPSK, for DAF, TS-DHD and GQ-EAF at the relay vs. relay channel gain g, for different values of C. 



Lastly, figure 9 depicts the regions in the $g$-$C$ plane in which each of the methods considered here is superior,
in a similar manner to [11, figure 2]$^4$. As can be observed from the figure, in the noisy region of small $g$ and
also in the region of very large $C$, GQ-EAF is superior, and in the strong relay region of medium-to-high $g$ and
medium-to-high $C$, TS-DHD is the superior method. DAF is superior for small $C$ and high $g$. In a sense, the TS-DHD
method is a hybrid between DAF, which makes a hard decision on the entire block, and GQ-EAF, which
makes a soft decision on every symbol; it is therefore superior in the transition region between the region where DAF
is distinctly better and the region where GQ-EAF is distinctly superior.



4 The block shapes are due to the step-size of 0.2 in the values of g and C used for evaluating the rates. In the final version we will present 
an evaluation over a finer grid (such an evaluation requires several weeks to complete). 
[Figure 9 (plot omitted). Horizontal axis: C - Relay link capacity.]



Fig. 9. The best cooperation strategy (out of DAF, TS-DHD and GQ-EAF) for the Gaussian relay channel with BPSK transmission. 

D. When the SNR on the Direct Link Approaches 0 ($\sigma^2 \to \infty$)

In this subsection we analyze the relaying strategies discussed in this section as the SNR on the direct link
$X - Y$ approaches zero. Because TS-DHD is a hybrid method combining both DHD and HD-EAF, we analyze
the behavior of the components rather than the hybrid, to gain more insight. This analysis is particularly useful
when trying to numerically evaluate the rates, since as the direct-link SNR goes to zero, the computer's numerical
accuracy does not allow the rates to be obtained numerically from the general expressions.

First we note that when the SNR of the direct link $X - Y$ approaches 0 we have that $I(X;Y) \to 0$ as well. To
see this we write

$$\begin{aligned}
I(X;Y) &= h(Y) - h(Y|X) \\
&= h(Y) - h(X + N|X) \\
&= h(Y) - h(N),
\end{aligned}$$






with h(Y) = - f(y) Iog 2 (/(y))dy, and from fA3 



/(F) = ifG^^a^ + G^-VP.a 2 ) 



1 / 1 («- VP) 2 (y+N^P) 2 



2 V\/2^2 

y 2 ( \ yVP 1 _iv'? N 

= e 2^ — e » 2 H — e " 2 I e 2 ° 



V2^2 ^2 2 

1 , (yy/P S 

-e cosh — __ 



V21tct 2 



2 1 

a — >oo 1 y_ 

ze 2o 



\/27T(7 2 

= G y (0,a 2 ), 

_ » 2 

where the approximation is in the sense that for small \y\ we have cosh(|y|) ~ 1 and for large \y\, e ^ drives 

2 

— y 9 

the entire expression to zero as e 5^, for cr — » oo. This approximation reflects the intuitive notion that as the 
variance increases to infinity, the two-component, symmetric Gaussian mixture resembles more and more a zero- 
mean Gaussian RV with the same variance. Therefore, for low SNR, the output is very close to a zero-mean Normal 
RV with variance a 2 , and h(Y) w h(N), 5 hence 

I(X;Y) a2 ^° 0. 

Note that the upper bound and the decode-and-forward rate in this case are both equal to 

Rdaf = Rupper = min {C, I(X; Y ± )} . 
Now, let us evaluate the rate for HD-EAF as the SNR goes to zero. From ( I35at : 

R < I(X- Y, F) = I{X; Fl) + I{X- F|F X ), 

and 

/(X;F|F!) = ft(F|Fx) — h(Y\X,Yi) 

= Pr(F = l)/i(F|F x = 1) + Pr(F = £)/i(F|Fi = E) + Pr(F = -l)ft(F|Fi = -1) - h(N). 

^5 For $\sigma = 20$ we have that $\int_{-\infty}^{\infty}\left|f_Y(y) - G_y(0,\sigma^2)\right|dy < 0.001$, for $\sigma = 55$, $h(Y) - h(N) \approx 0.001$ and for $\sigma = 200$, $h(Y) - h(N) < 0.0001$.
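Using appendix A, equations (A.5) - (A.7), we have

\[
\begin{aligned}
h(Y|\hat{Y}_1 = 1) &= -\int_{-\infty}^{\infty} f_{Y|\hat{Y}_1}(y|\hat{y}_1 = 1)\log_2\left(f_{Y|\hat{Y}_1}(y|\hat{y}_1 = 1)\right)dy, \\
f_{Y|\hat{Y}_1}(y|\hat{y}_1 = 1) &= \frac{f_{Y,Y_1}(y, y_1 > 0)\,P_{\text{no erase}}}{\Pr(Y_1 > 0)\,P_{\text{no erase}}} = \frac{f_{Y,Y_1}(y, y_1 > 0)}{\Pr(Y_1 > 0)}, \\
f_{Y,Y_1}(y, y_1 > 0) &= \frac{1}{2}\left(f_{Y,Y_1|X}(y, y_1 > 0|x = \sqrt{P}) + f_{Y,Y_1|X}(y, y_1 > 0|x = -\sqrt{P})\right) \\
&= \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)\Pr(Y_1 > 0|X = \sqrt{P}) + G_y(-\sqrt{P},\sigma^2)\left(1 - \Pr(Y_1 > 0|X = \sqrt{P})\right)\right) \\
&= \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{y^2}{2\sigma^2}}\left(\frac{1}{2}e^{\frac{y\sqrt{P}}{\sigma^2}}\Pr(Y_1 > 0|X = \sqrt{P}) + \frac{1}{2}e^{-\frac{y\sqrt{P}}{\sigma^2}}\left(1 - \Pr(Y_1 > 0|X = \sqrt{P})\right)\right)e^{-\frac{P}{2\sigma^2}} \\
&= \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{y^2}{2\sigma^2}}\,e^{-\frac{P}{2\sigma^2}}\left(\frac{1}{2}\cosh\left(\frac{y\sqrt{P}}{\sigma^2}\right) - \delta\sinh\left(\frac{y\sqrt{P}}{\sigma^2}\right)\right) \\
&\stackrel{(a)}{\approx} \frac{1}{2}G_y(0,\sigma^2),
\end{aligned}
\]

when $\sigma^2 \to \infty$ and $\delta \in \left[-\frac{1}{2},\frac{1}{2}\right]$ is selected such that $\Pr(Y_1 > 0|X = \sqrt{P}) = \frac{1}{2} - \delta$. The approximation in (a) holds
because for small $|y|$, $\sinh\left(y\sqrt{P}/\sigma^2\right) \approx 0$ and $\cosh\left(y\sqrt{P}/\sigma^2\right) \approx 1$, and for large $|y|$, both $e^{-y^2/(2\sigma^2)}\sinh\left(y\sqrt{P}/\sigma^2\right) \to 0$ and
$e^{-y^2/(2\sigma^2)}\cosh\left(y\sqrt{P}/\sigma^2\right) \to 0$. Hence

\[
\begin{aligned}
h(Y|\hat{Y}_1 = 1) &\approx -\int_{-\infty}^{\infty}\frac{1}{2\Pr(Y_1 > 0)}G_y(0,\sigma^2)\log_2\left(\frac{1}{2\Pr(Y_1 > 0)}G_y(0,\sigma^2)\right)dy \\
&= -\frac{1}{2\Pr(Y_1 > 0)}\int_{-\infty}^{\infty}G_y(0,\sigma^2)\left[\log_2\left(G_y(0,\sigma^2)\right) - \log_2\left(2\Pr(Y_1 > 0)\right)\right]dy \\
&= \frac{1}{2\Pr(Y_1 > 0)}\left[h(N) + \log_2\left(2\Pr(Y_1 > 0)\right)\right],
\end{aligned}
\]

and using $\Pr(Y_1 > 0) = \Pr(Y_1 < 0) = \frac{1}{2}$ and $h(Y|\hat{Y}_1 = 1) = h(Y|\hat{Y}_1 = -1)$, we obtain

\[
\begin{aligned}
h(Y|\hat{Y}_1) &\approx \frac{1}{2}P_{\text{no erase}}\,h(N) + (1 - P_{\text{no erase}})\,h(N) + \frac{1}{2}P_{\text{no erase}}\,h(N) \\
&= h(N).
\end{aligned}
\]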

Therefore, at low SNR, $Y$ and $\hat{Y}_1$ become independent. Then, $I(X;Y|\hat{Y}_1) = h(Y|\hat{Y}_1) - h(N) \approx 0$ and the
information rate becomes (see appendix A-E)

\[
\begin{aligned}
R \le I(X;\hat{Y}_1) &= H(\hat{Y}_1) - H(\hat{Y}_1|X) \\
&= P_{\text{no erase}}\left(1 - H(P_1, 1 - P_1)\right),
\end{aligned}
\]

where $H(\cdot)$ is the discrete entropy for the specified discrete distribution and $P_1 = \Pr(Y_1 > 0|X = \sqrt{P})$. Now,
consider the feasibility condition $C \ge I(\hat{Y}_1;Y_1|Y)$:

\[
\begin{aligned}
I(\hat{Y}_1;Y_1|Y) &= H(\hat{Y}_1|Y) - H(\hat{Y}_1|Y_1,Y) \\
&\stackrel{(a)}{\approx} H(\hat{Y}_1) - H(\hat{Y}_1|Y_1) \\
&= P_{\text{no erase}},
\end{aligned}
\]

where (a) follows from the independence of $Y$ and $Y_1$ at low SNR, see appendix A-E. Therefore, for low SNR,
we set $P_{\text{no erase}} = \min\{C, 1\}$ and the rate becomes

\[
R \le \min\{C, 1\}\left(1 - H(P_1, 1 - P_1)\right).
\]
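
The closed-form low-SNR rate for HD-EAF is simple to evaluate. The sketch below assumes the relay observation model $Y_1 = gX + N_1$ with $N_1 \sim \mathcal{N}(0,\sigma_1^2)$, so that $P_1 = \Pr(Y_1 > 0|X = \sqrt{P})$ is a Gaussian tail probability; the function names are hypothetical and chosen only for this illustration.

```python
import math


def binary_entropy(p):
    """H(p, 1 - p) in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def hd_eaf_low_snr_rate(C, g, P, sigma1_sq):
    """Evaluate min{C, 1} * (1 - H(P1, 1 - P1)) for the low direct-link SNR regime.

    Assumes Y1 = g X + N1 with N1 ~ N(0, sigma1_sq), so that
    P1 = Pr(Y1 > 0 | X = sqrt(P)) = Phi(g sqrt(P) / sigma1).
    """
    p1 = 0.5 * (1.0 + math.erf(g * math.sqrt(P) / math.sqrt(2.0 * sigma1_sq)))
    p_no_erase = min(C, 1.0)
    return p_no_erase * (1.0 - binary_entropy(p1))


if __name__ == "__main__":
    for C in (0.2, 0.6, 1.0, 1.4):
        print(C, hd_eaf_low_snr_rate(C, g=1.0, P=1.0, sigma1_sq=1.0))
```

The printout illustrates how the conference capacity scales the rate of the binary source-relay channel until it saturates at $C = 1$.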

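For the GQ-EAF we first approximate $f(Y,\hat{Y}_1)$ at low SNR starting with (A.8):

\[
\begin{aligned}
f_{Y,\hat{Y}_1}(y,\hat{y}_1) &= \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)\,G_{\hat{y}_1}(g\sqrt{P},\sigma_1^2 + \sigma_Q^2) + G_y(-\sqrt{P},\sigma^2)\,G_{\hat{y}_1}(-g\sqrt{P},\sigma_1^2 + \sigma_Q^2)\right) \\
&= G_y(0,\sigma^2)\left(\frac{1}{2}G_{\hat{y}_1}(g\sqrt{P},\sigma_1^2 + \sigma_Q^2)\,e^{\frac{y\sqrt{P}}{\sigma^2}} + \frac{1}{2}G_{\hat{y}_1}(-g\sqrt{P},\sigma_1^2 + \sigma_Q^2)\,e^{-\frac{y\sqrt{P}}{\sigma^2}}\right)e^{-\frac{P}{2\sigma^2}} \\
&\approx G_y(0,\sigma^2)\,\frac{1}{2}\left(G_{\hat{y}_1}(g\sqrt{P},\sigma_1^2 + \sigma_Q^2) + G_{\hat{y}_1}(-g\sqrt{P},\sigma_1^2 + \sigma_Q^2)\right),
\end{aligned}
\]

as $e^{\pm y\sqrt{P}/\sigma^2} \approx 1$ in the region where $G_{\hat{y}_1}$ is significant, for both $X = \sqrt{P}$ and $X = -\sqrt{P}$. We conclude that as the
direct SNR approaches 0, $Y$ and $\hat{Y}_1$ become independent. Now, the rate is given by:

\[
\begin{aligned}
R &\le I(X;Y,\hat{Y}_1) \\
&= h(Y,\hat{Y}_1) - h(Y,\hat{Y}_1|X) \\
&\approx h(Y) + h(\hat{Y}_1) - h(X + N, gX + N_1 + N_Q|X) \\
&= h(Y) + h(\hat{Y}_1) - h(N, N_1 + N_Q|X) \\
&= h(Y) - h(N|X) + h(\hat{Y}_1) - h(N_1 + N_Q|X) \\
&= I(X;Y) + I(X;\hat{Y}_1) \\
&\approx I(X;\hat{Y}_1) \\
&= h(\hat{Y}_1) - h(N_1 + N_Q). \qquad (43)
\end{aligned}
\]

The feasibility condition becomes:

\[
\begin{aligned}
C &\ge I(\hat{Y}_1;Y_1|Y) \\
&= h(\hat{Y}_1|Y) - h(\hat{Y}_1|Y_1) \\
&\approx h(\hat{Y}_1) - h(N_Q), \qquad (44)
\end{aligned}
\]

with

\[
f_{\hat{Y}_1}(\hat{y}_1) = \frac{1}{2}\left(G_{\hat{y}_1}(g\sqrt{P},\sigma_1^2 + \sigma_Q^2) + G_{\hat{y}_1}(-g\sqrt{P},\sigma_1^2 + \sigma_Q^2)\right).
\]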
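
One way to evaluate (43) subject to (44) numerically is sketched below. It assumes that the smallest quantization noise variance $\sigma_Q^2$ meeting (44) with equality is the one to use (the right-hand side of (44) decreases as $\sigma_Q^2$ grows), and finds it by bisection on a logarithmic scale. The function names, the bisection bracket and the integration grid are illustrative assumptions; the grid is adequate only when $g\sqrt{P}$ is not much larger than $\sigma_1$.

```python
import math


def gauss(y, mean, var):
    """Gaussian density G_y(mean, var), in the notation of (A.4)."""
    return math.exp(-(y - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def mixture_entropy(mean, var, steps=2001):
    """Differential entropy (bits) of 0.5 N(mean, var) + 0.5 N(-mean, var)."""
    lo = -abs(mean) - 10.0 * math.sqrt(var)
    dy = (-2.0 * lo) / (steps - 1)
    h = 0.0
    for k in range(steps):
        y = lo + k * dy
        f = 0.5 * (gauss(y, mean, var) + gauss(y, -mean, var))
        if f > 0.0:
            weight = 0.5 if k in (0, steps - 1) else 1.0
            h -= weight * f * math.log2(f) * dy
    return h


def gaussian_entropy(var):
    """Differential entropy (bits) of N(0, var)."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * var)


def gq_eaf_low_snr_rate(C, g, P, sigma1_sq, iters=40):
    """Return (rate, sigma_q_sq): rate from (43) with sigma_q_sq chosen via (44)."""
    mean = g * math.sqrt(P)

    def conference_load(sq):
        # Right-hand side of (44): h(Yhat_1) - h(N_Q).
        return mixture_entropy(mean, sigma1_sq + sq) - gaussian_entropy(sq)

    lo, hi = 1e-9, 1e9  # assumed bracket for sigma_Q^2
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if conference_load(mid) > C:
            lo = mid  # quantization too fine for the conference link
        else:
            hi = mid
    sq = hi
    rate = mixture_entropy(mean, sigma1_sq + sq) - gaussian_entropy(sigma1_sq + sq)
    return rate, sq


if __name__ == "__main__":
    print(gq_eaf_low_snr_rate(C=1.0, g=1.0, P=1.0, sigma1_sq=1.0))
```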
For DHD, as $\sigma^2 \to \infty$ we have

\[
\begin{aligned}
I(X;\hat{Y}_1,Y) &= I(X;Y) + I(X;\hat{Y}_1|Y) \\
&\approx I(X;\hat{Y}_1|Y) \\
&= H(\hat{Y}_1|Y) - H(\hat{Y}_1|Y,X) \\
&\stackrel{(a)}{\approx} H(\hat{Y}_1) - H(\hat{Y}_1|X),
\end{aligned}
\]

where (a) follows from the independence of $Y$ and $\hat{Y}_1$ as $\sigma^2 \to \infty$ and the fact that $\hat{Y}_1$ is a deterministic function
of $Y_1$, combined with the fact that given $X$, $Y_1$ and $Y$ are independent. The feasibility condition becomes

\[
C \ge H(\hat{Y}_1|Y) \approx H(\hat{Y}_1).
\]

Because $I(X;\hat{Y}_1)$ is not a monotone function of $T$ we have to optimize over $T$ to find the actual rate.
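
The optimization over $T$ can be carried out by a simple grid search. The sketch below assumes a symmetric three-level deterministic quantizer at the relay ($\hat{Y}_1 = 1$ if $Y_1 > T$, $\hat{Y}_1 = -1$ if $Y_1 < -T$, and an erasure otherwise) and the relay observation $Y_1 = gX + N_1$, $N_1 \sim \mathcal{N}(0,\sigma_1^2)$; the function names, the grid resolution and the search range are illustrative assumptions.

```python
import math


def q_func(x):
    """Gaussian tail probability Pr(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def entropy(probs):
    """Discrete entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def dhd_low_snr_rate(C, g, P, sigma1_sq, grid=4000):
    """Maximize H(Yhat_1) - H(Yhat_1|X) over T >= 0 subject to H(Yhat_1) <= C.

    Returns (best_rate, best_T); best_T is None if no grid point is feasible.
    """
    s1 = math.sqrt(sigma1_sq)
    m = g * math.sqrt(P)
    t_max = m + 8.0 * s1  # assumed search range for the threshold
    best_rate, best_t = 0.0, None
    for k in range(grid + 1):
        t = t_max * k / grid
        p_plus = q_func((t - m) / s1)    # Pr(Yhat_1 = +1 | X = +sqrt(P))
        p_minus = q_func((t + m) / s1)   # Pr(Yhat_1 = -1 | X = +sqrt(P))
        p_erase = max(0.0, 1.0 - p_plus - p_minus)
        h_cond = entropy([p_plus, p_erase, p_minus])  # H(Yhat_1|X), by symmetry
        avg = 0.5 * (p_plus + p_minus)
        h_marg = entropy([avg, p_erase, avg])         # H(Yhat_1)
        feasible = h_marg <= C + 1e-12
        if feasible and (best_t is None or h_marg - h_cond > best_rate):
            best_rate, best_t = h_marg - h_cond, t
    return best_rate, best_t


if __name__ == "__main__":
    for C in (0.4, 0.8, 1.2):
        print(C, dhd_low_snr_rate(C, g=1.0, P=1.0, sigma1_sq=1.0))
```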

As can be seen from the expression for HD-EAF, when the SNR on the direct link decreases, the capacity of the
conference link acts as a scaling factor on the rate of the binary channel from the source to the relay. In figure 10
we plotted the information rate for DHD, HD-EAF, GQ-EAF and DAF (which coincides with the upper bound).
Comparing the three EAF strategies we note that DHD, which at intermediate SNR on the source-relay channel
performs well for C > 0.8, has the worst performance at low SNR up to C = 1.2. At C = 1.2, DHD becomes
the best technique out of the three. For C < 1.2 and high SNR on the source-relay channel, HD-EAF outperforms
both DHD and GQ-EAF. For low SNR on the source-relay channel, GQ-EAF is again superior.

Fig. 10. Information rate with DAF, DHD, HD-EAF and GQ-EAF vs. relay channel gain g, for different values of C, at low SNR on the
source-relay link.



E. Discussion 

We make the following observations: 

• As noted at the beginning of this section, for low SNR on the source-relay link, GQ-EAF outperforms TS-DHD.
To see why, consider the distribution of $Y_1$:

\[
\begin{aligned}
f_{Y_1}(y_1) &= G_{y_1}(0,\sigma_1^2)\cosh\left(\frac{y_1 g\sqrt{P}}{\sigma_1^2}\right)e^{-\frac{g^2 P}{2\sigma_1^2}} \\
&\approx G_{y_1}(0,\sigma_1^2)\left(1 - \frac{g^2 P}{2\sigma_1^2}\right),
\end{aligned}
\]

where the approximation is obtained using the first order Taylor expansion, and the fact that for large values
of $Y_1$, $G_{y_1}(0,\sigma_1^2)$ dominates the expression. Therefore, as $g \to 0$, $Y_1$ approaches a zero-mean Gaussian RV:
$Y_1 \to \mathcal{N}(0,\sigma_1^2)$. As discussed in [24, ch. 13.1], the closer the reconstruction variable is to the original variable,
the better the quantization performance is expected to be. It is therefore natural to expect that GQ
will perform better at low relay link SNR.

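• At the other extreme, as $g \to \infty$, consider the DAF strategy: as $g \to \infty$, we have that

\[
\begin{aligned}
h(Y_1) &= -\int_{y_1=-\infty}^{\infty}\frac{1}{2}\left[G_{y_1}(g\sqrt{P},\sigma_1^2) + G_{y_1}(-g\sqrt{P},\sigma_1^2)\right]\log_2\left(\frac{1}{2}\left[G_{y_1}(g\sqrt{P},\sigma_1^2) + G_{y_1}(-g\sqrt{P},\sigma_1^2)\right]\right)dy_1 \\
&\approx -\int_{y_1=-\infty}^{\infty}\frac{1}{2}G_{y_1}(g\sqrt{P},\sigma_1^2)\log_2\left(\frac{1}{2}G_{y_1}(g\sqrt{P},\sigma_1^2)\right)dy_1 \\
&\quad -\int_{y_1=-\infty}^{\infty}\frac{1}{2}G_{y_1}(-g\sqrt{P},\sigma_1^2)\log_2\left(\frac{1}{2}G_{y_1}(-g\sqrt{P},\sigma_1^2)\right)dy_1 \\
&= 1 + h(N_1),
\end{aligned}
\]

and therefore,

\[
I(X;Y_1) = h(Y_1) - h(Y_1|X) \approx 1 + h(N_1) - h(N_1) = 1 = H(X).
\]

Hence,

\[
R_{DAF} = \min\left\{I(X;Y_1), I(X;Y) + C\right\} = \min\left\{1, I(X;Y) + C\right\},
\]

which is the maximal rate. Therefore, as $g \to \infty$ DAF provides the optimal rate.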

We can expect that at intermediate SNR, methods that balance between the soft-decision per symbol of GQ-EAF
and the hard-decision on the entire codeword of DAF, will be superior to both. Furthermore, we believe
that as the SNR decreases, increasing the cardinality of $\hat{Y}_1$ accordingly will improve the performance.
V. Multi-Step Cooperative Broadcast Application

In this section we consider the cooperative broadcast (BC) scenario. In this scenario, one transmitter communicates
with two receivers. In its most general form, the transmitter sends three independent messages: a common message
intended for both receivers and two private messages, one for each receiver, where all three messages are encoded
into a single channel codeword $X^n$. Each receiver gets a noisy version of the codeword, $Y_1^n$ at $R_{x1}$ and $Y_2^n$ at
$R_{x2}$. After reception, the receivers exchange messages in a $K$-cycle conference over noiseless conference links of
finite capacities $C_{12}$ and $C_{21}$. Each conference message is based on the channel output at each receiver and the
conference messages previously received from the other receiver, in a similar manner to the conference defined by
Willems in [26] for the cooperative MAC. After conferencing, each receiver decodes its message. This scenario is
depicted in figure 11. This setup was studied in [12] for the single common message case over the independent BC
(i.e. $p(y_1,y_2|x) = \prod_{i=1}^{n} p(y_{1,i}|x_i)\,p(y_{2,i}|x_i)$), and in [13] for the general setup with a single cycle of conferencing.

Fig. 11. The broadcast channel with cooperating receivers. The encoder sends three messages, a common message $W_0$, a private message to
$R_{x1}$, $W_1$, and a private message to $R_{x2}$, $W_2$. $\hat{W}_0$ and $\hat{\hat{W}}_0$ are the estimates of $W_0$ at $R_{x1}$ and $R_{x2}$ respectively.



A. Definitions 

We use the standard definition for the discrete memoryless general broadcast channel given in [28]. We define a
cooperative coding scheme as follows:

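Definition 5: A $(C_{12},C_{21})$-admissible $K$-cycle conference consists of the following elements:

1) $K$ message sets from $R_{x1}$ to $R_{x2}$, denoted by $\mathcal{W}_{12}^{(1)}, \mathcal{W}_{12}^{(2)}, \dots, \mathcal{W}_{12}^{(K)}$, and $K$ message sets from $R_{x2}$ to $R_{x1}$,
denoted by $\mathcal{W}_{21}^{(1)}, \mathcal{W}_{21}^{(2)}, \dots, \mathcal{W}_{21}^{(K)}$. Message set $\mathcal{W}_{12}^{(k)}$ consists of $2^{nR_{12}^{(k)}}$ messages and message set $\mathcal{W}_{21}^{(k)}$
consists of $2^{nR_{21}^{(k)}}$ messages.

2) $K$ mapping functions, one for each conference step from $R_{x1}$ to $R_{x2}$:

\[
h_{12}^{(k)}: \mathcal{Y}_1^n \times \mathcal{W}_{21}^{(1)} \times \mathcal{W}_{21}^{(2)} \times \dots \times \mathcal{W}_{21}^{(k-1)} \mapsto \mathcal{W}_{12}^{(k)},
\]

and $K$ mapping functions, one for each conference step from $R_{x2}$ to $R_{x1}$:

\[
h_{21}^{(k)}: \mathcal{Y}_2^n \times \mathcal{W}_{12}^{(1)} \times \mathcal{W}_{12}^{(2)} \times \dots \times \mathcal{W}_{12}^{(k)} \mapsto \mathcal{W}_{21}^{(k)},
\]

where $k = 1, 2, \dots, K$.
The conference rates satisfy:

\[
C_{12} = \sum_{k=1}^{K} R_{12}^{(k)}, \qquad C_{21} = \sum_{k=1}^{K} R_{21}^{(k)}.
\]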

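Definition 6: A $(2^{nR_0}, 2^{nR_1}, 2^{nR_2}, n, C_{12}, C_{21}, K)$ code for the general broadcast channel with a common message
and two independent private messages, consists of three sets of source messages, $\mathcal{M}_0 = \{1, 2, \dots, 2^{nR_0}\}$,
$\mathcal{M}_1 = \{1, 2, \dots, 2^{nR_1}\}$ and $\mathcal{M}_2 = \{1, 2, \dots, 2^{nR_2}\}$, a mapping function at the transmitter,

\[
f: \mathcal{M}_0 \times \mathcal{M}_1 \times \mathcal{M}_2 \mapsto \mathcal{X}^n,
\]

a $(C_{12},C_{21})$-admissible $K$-cycle conference, and two decoders,

\[
\begin{aligned}
g_1 &: \mathcal{W}_{21}^{(1)} \times \mathcal{W}_{21}^{(2)} \times \dots \times \mathcal{W}_{21}^{(K)} \times \mathcal{Y}_1^n \mapsto \mathcal{M}_0 \times \mathcal{M}_1, \\
g_2 &: \mathcal{W}_{12}^{(1)} \times \mathcal{W}_{12}^{(2)} \times \dots \times \mathcal{W}_{12}^{(K)} \times \mathcal{Y}_2^n \mapsto \mathcal{M}_0 \times \mathcal{M}_2.
\end{aligned}
\]

Definition 7: The average probability of error is defined as the average probability that at least one of the
receivers does not decode its message pair correctly:

\[
P_e^{(n)} = \Pr\left(g_1\left(W_{21}^{(1)}, W_{21}^{(2)}, \dots, W_{21}^{(K)}, Y_1^n\right) \ne (M_0, M_1) \text{ or } g_2\left(W_{12}^{(1)}, W_{12}^{(2)}, \dots, W_{12}^{(K)}, Y_2^n\right) \ne (M_0, M_2)\right),
\]

where we assume that each message is selected uniformly and independently over its respective message set.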

B. The Cooperative Broadcast Channel with Two Independent and One Common Message 

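We first present the general result for the cooperative broadcast scenario with a $K$-cycle conference. Denote
$\hat{\mathbf{Y}}_1 = \left(\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(K)}\right)$ and $\hat{\mathbf{Y}}_2 = \left(\hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(K)}\right)$. Let $R_1$ and $R_2$ be the private rates to $R_{x1}$
and $R_{x2}$ respectively, and let $R_0$ denote the rate of the common information. Then, the following rate triplets are
achievable:

Theorem 4: Consider the general broadcast channel $(\mathcal{X}, p(y_1,y_2|x), \mathcal{Y}_1 \times \mathcal{Y}_2)$ with cooperating receivers, having
noiseless conference links of finite capacities $C_{12}$ and $C_{21}$ between them. Let the receivers hold a conference that
consists of $K$ cycles. Then, any rate triplet $(R_0, R_1, R_2)$ satisfying

\[
\begin{aligned}
R_0 &\le \min\left\{I(W;Y_1,\hat{\mathbf{Y}}_2),\, I(W;\hat{\mathbf{Y}}_1,Y_2)\right\} && (45a) \\
R_1 &\le I(U;Y_1,\hat{\mathbf{Y}}_2|W) && (45b) \\
R_2 &\le I(V;\hat{\mathbf{Y}}_1,Y_2|W) && (45c) \\
R_1 + R_2 &\le I(U;Y_1,\hat{\mathbf{Y}}_2|W) + I(V;\hat{\mathbf{Y}}_1,Y_2|W) - I(U;V|W), && (45d)
\end{aligned}
\]

subject to

\[
\begin{aligned}
C_{12} &\ge I(Y_1;\hat{\mathbf{Y}}_1,\hat{\mathbf{Y}}_2|Y_2) && (46a) \\
C_{21} &\ge I(Y_2;\hat{\mathbf{Y}}_2,\hat{\mathbf{Y}}_1|Y_1), && (46b)
\end{aligned}
\]

for some joint distribution

\[
\begin{aligned}
&p\left(w,u,v,x,y_1,y_2,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(K)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(K)}\right) = \\
&\quad p(w,u,v,x)\,p(y_1,y_2|x)\,p\left(\hat{y}_1^{(1)}|y_1\right)p\left(\hat{y}_2^{(1)}|y_2,\hat{y}_1^{(1)}\right)\cdots p\left(\hat{y}_1^{(k)}|y_1,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(k-1)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(k-1)}\right) \\
&\quad p\left(\hat{y}_2^{(k)}|y_2,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(k)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(k-1)}\right)\cdots p\left(\hat{y}_1^{(K)}|y_1,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(K-1)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(K-1)}\right) \\
&\quad \times p\left(\hat{y}_2^{(K)}|y_2,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(K)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(K-1)}\right) \qquad (47)
\end{aligned}
\]

is achievable. The cardinalities of the $k$'th auxiliary random variables are bounded by:

\[
\begin{aligned}
\|\hat{\mathcal{Y}}_1^{(k)}\| &\le \|\mathcal{Y}_1\| \times \prod_{l=1}^{k-1}\|\hat{\mathcal{Y}}_1^{(l)}\| \times \prod_{l=1}^{k-1}\|\hat{\mathcal{Y}}_2^{(l)}\| + 1, \qquad k = 1, 2, \dots, K \\
\|\hat{\mathcal{Y}}_2^{(k)}\| &\le \|\mathcal{Y}_2\| \times \prod_{l=1}^{k}\|\hat{\mathcal{Y}}_1^{(l)}\| \times \prod_{l=1}^{k-1}\|\hat{\mathcal{Y}}_2^{(l)}\| + 1, \qquad k = 1, 2, \dots, K.
\end{aligned}
\]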

Proof: 

1) Overview of Strategy: The coding strategy is based on combining the BC code construction of [29], after
incorporating the common message into the construction, with the $K$-cycle conference of [30]. The transmitter
constructs a broadcast code to split the rate between the three message sets. This is done independently of the
relaying scheme. Each receiver generates its conference messages according to the construction of [30]. After
$K$ cycles of conferencing each receiver decodes its information based on its channel output and the conference
messages received from the other receiver.

2) Code Construction at The Transmitter: 

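• Fix all the distributions in (47). Fix $\epsilon > 0$ and let $n > 1$. Let $\delta > 0$ be a positive number whose value is
determined in the following steps. Let $R(W) = \min\left\{I(W;Y_1,\hat{\mathbf{Y}}_2),\, I(W;\hat{\mathbf{Y}}_1,Y_2)\right\}$. Let $S^{(n)}_{[W]\delta}$ denote
the set of all $\mathbf{w} \in \mathcal{W}^n$ sequences such that $\mathbf{w} \in A^{*(n)}_{\delta}(W)$ and $A^{*(n)}_{\delta}(U,V|\mathbf{w})$ is non-empty, as defined in
[23, corollary 5.11]. From [23, corollary 5.11] we have that $\|S^{(n)}_{[W]\delta}\| \ge 2^{n(H(W)-\phi)}$, where $\phi \to 0$ as $\delta \to 0$
and $n \to \infty$.

• Pick $2^{n(R(W)-\epsilon)}$ sequences from $S^{(n)}_{[W]\delta}$ in a uniform and independent manner according to

\[
\Pr(\mathbf{w}) = \begin{cases} \dfrac{1}{\|S^{(n)}_{[W]\delta}\|}, & \mathbf{w} \in S^{(n)}_{[W]\delta} \\ 0, & \text{otherwise.} \end{cases}
\]

Label these sequences with $l \in \mathcal{M}_0 = \{1, 2, \dots, 2^{n(R(W)-\epsilon)}\}$.

• For each sequence $\mathbf{w}(l)$, $l \in \mathcal{M}_0$, consider the set $A^{*(n)}_{\delta'}(U|\mathbf{w}(l))$, $\delta' = \delta\max\{\|\mathcal{U}\|, \|\mathcal{V}\|\}$. Since the sequences
$\mathbf{w} \in \mathcal{W}^n$ are selected such that $A^{*(n)}_{\delta}(U,V|\mathbf{w}(l))$ is non-empty and since $(\mathbf{u},\mathbf{v}) \in A^{*(n)}_{\delta}(U,V|\mathbf{w}(l))$ implies
$\mathbf{u} \in A^{*(n)}_{\delta'}(U|\mathbf{w}(l))$, then also $A^{*(n)}_{\delta'}(U|\mathbf{w}(l))$ is non-empty, and by [23, theorem 5.9], $\|A^{*(n)}_{\delta'}(U|\mathbf{w}(l))\| \ge
2^{n(H(U|W)-\psi)}$, $\psi \to 0$ as $\delta' \to 0$ and $n \to \infty$.

• For each $l \in \mathcal{M}_0$ pick $2^{n(I(U;Y_1,\hat{\mathbf{Y}}_2|W)-\epsilon)}$ sequences in a uniform and independent manner from $A^{*(n)}_{\delta'}(U|\mathbf{w}(l))$
according to

\[
\Pr(\mathbf{u}|l) = \begin{cases} \dfrac{1}{\|A^{*(n)}_{\delta'}(U|\mathbf{w}(l))\|}, & \mathbf{u} \in A^{*(n)}_{\delta'}(U|\mathbf{w}(l)) \\ 0, & \text{otherwise.} \end{cases}
\]

Label these sequences with $\mathbf{u}(i|l)$, $i \in \mathcal{Z}_1 = \left\{1, 2, \dots, 2^{n(I(U;Y_1,\hat{\mathbf{Y}}_2|W)-\epsilon)}\right\}$. Similarly, pick $2^{n(I(V;\hat{\mathbf{Y}}_1,Y_2|W)-\epsilon)}$
sequences in a uniform and independent manner from $A^{*(n)}_{\delta'}(V|\mathbf{w}(l))$ according to

\[
\Pr(\mathbf{v}|l) = \begin{cases} \dfrac{1}{\|A^{*(n)}_{\delta'}(V|\mathbf{w}(l))\|}, & \mathbf{v} \in A^{*(n)}_{\delta'}(V|\mathbf{w}(l)) \\ 0, & \text{otherwise.} \end{cases}
\]

Label these sequences with $\mathbf{v}(j|l)$, $j \in \mathcal{Z}_2 = \left\{1, 2, \dots, 2^{n(I(V;\hat{\mathbf{Y}}_1,Y_2|W)-\epsilon)}\right\}$. $\delta$ is selected such that $\|S^{(n)}_{[W]\delta}\| \ge
2^{n(R(W)-\epsilon)}$, and $\forall l \in \mathcal{M}_0$ we have that $\|A^{*(n)}_{\delta'}(U|\mathbf{w}(l))\| \ge 2^{n(I(U;Y_1,\hat{\mathbf{Y}}_2|W)-\epsilon)}$ and $\|A^{*(n)}_{\delta'}(V|\mathbf{w}(l))\| \ge
2^{n(I(V;\hat{\mathbf{Y}}_1,Y_2|W)-\epsilon)}$.

• Partition the set $\mathcal{Z}_1$ into $2^{nR_1}$ subsets $B_{w_1}$, $w_1 \in \mathcal{M}_1 = \{1, 2, \dots, 2^{nR_1}\}$, let

\[
B_{w_1} = \left[(w_1 - 1)2^{n(I(U;Y_1,\hat{\mathbf{Y}}_2|W)-R_1-\epsilon)} + 1,\; w_1 2^{n(I(U;Y_1,\hat{\mathbf{Y}}_2|W)-R_1-\epsilon)}\right].
\]

Similarly partition the set $\mathcal{Z}_2$ into $2^{nR_2}$ subsets $C_{w_2}$, $w_2 \in \mathcal{M}_2 = \{1, 2, \dots, 2^{nR_2}\}$, let

\[
C_{w_2} = \left[(w_2 - 1)2^{n(I(V;\hat{\mathbf{Y}}_1,Y_2|W)-R_2-\epsilon)} + 1,\; w_2 2^{n(I(V;\hat{\mathbf{Y}}_1,Y_2|W)-R_2-\epsilon)}\right].
\]

• For each triplet $(l, w_1, w_2)$ consider the set

\[
\mathcal{D}(w_1,w_2|l) = \left\{(m_1,m_2) : m_1 \in B_{w_1},\, m_2 \in C_{w_2},\, \left(\mathbf{u}(m_1|l), \mathbf{v}(m_2|l)\right) \in A^{*(n)}_{\delta}(U,V|\mathbf{w}(l))\right\}.
\]

By [29, lemma on pg. 121], we have that taking $n$ large enough we can make $\Pr\left(\|\mathcal{D}(w_1,w_2|l)\| = 0\right) \le \epsilon$
for any arbitrary $\epsilon > 0$, as long as

\[
\begin{aligned}
R_1 &\le I(U;Y_1,\hat{\mathbf{Y}}_2|W) && (48a) \\
R_2 &\le I(V;\hat{\mathbf{Y}}_1,Y_2|W) && (48b) \\
R_1 + R_2 &\le I(U;Y_1,\hat{\mathbf{Y}}_2|W) + I(V;\hat{\mathbf{Y}}_1,Y_2|W) - I(U;V|W). && (48c)
\end{aligned}
\]

Note that the individual rate constraints are required to guarantee that the sets $B_{w_1}$ and $C_{w_2}$ are non-empty.

• For each $l \in \mathcal{M}_0$, we pick a unique pair $\left(m_1(w_1,w_2,l), m_2(w_1,w_2,l)\right) \in \mathcal{D}(w_1,w_2|l)$, $(w_1,w_2) \in
\mathcal{M}_1 \times \mathcal{M}_2$. The transmitter generates the codeword $\mathbf{x}(l,w_1,w_2)$ according to
$p\left(\mathbf{x}(l,w_1,w_2)\right) = \prod_{i=1}^{n} p\left(x_i|u_i(m_1(w_1,w_2,l)), v_i(m_2(w_1,w_2,l)), w_i(l)\right)$. When transmitting the triplet $(l,w_1,w_2)$
the transmitter outputs $\mathbf{x}(l,w_1,w_2)$.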
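3) Codebook Generation at the Receivers:

• For the first conference step from $R_{x1}$ to $R_{x2}$, $R_{x1}$ generates a codebook with $2^{nR'^{(1)}_{12}}$ codewords indexed by
$z_{12}^{(1)} \in \mathcal{Z}_{12}^{(1)} = \left\{1, 2, \dots, 2^{nR'^{(1)}_{12}}\right\}$ according to the distribution $p\left(\hat{y}_1^{(1)}\right)$: $p\left(\hat{\mathbf{y}}_1^{(1)}(z_{12}^{(1)})\right) = \prod_{i=1}^{n} p\left(\hat{y}_{1,i}^{(1)}(z_{12}^{(1)})\right)$.
$R_{x1}$ uniformly and independently partitions the message set $\mathcal{Z}_{12}^{(1)}$ into $2^{nR_{12}^{(1)}}$ subsets indexed by $w_{12}^{(1)} \in
\mathcal{W}_{12}^{(1)} = \left\{1, 2, \dots, 2^{nR_{12}^{(1)}}\right\}$. Denote these subsets with $S^{(1)}_{12,w_{12}^{(1)}}$.

• For the first conference step from $R_{x2}$ to $R_{x1}$, $R_{x2}$ generates a codebook with $2^{nR'^{(1)}_{21}}$ codewords indexed
by $z_{21}^{(1)} \in \mathcal{Z}_{21}^{(1)} = \left\{1, 2, \dots, 2^{nR'^{(1)}_{21}}\right\}$ for each codeword $\hat{\mathbf{y}}_1^{(1)}(z_{12}^{(1)})$, $z_{12}^{(1)} \in \mathcal{Z}_{12}^{(1)}$, in an i.i.d. manner according
to $p\left(\hat{\mathbf{y}}_2^{(1)}(z_{21}^{(1)}|z_{12}^{(1)})\right) = \prod_{i=1}^{n} p\left(\hat{y}_{2,i}^{(1)}(z_{21}^{(1)}|z_{12}^{(1)})\,\middle|\,\hat{y}_{1,i}^{(1)}(z_{12}^{(1)})\right)$. $R_{x2}$ uniformly and independently partitions the
message set $\mathcal{Z}_{21}^{(1)}$ into $2^{nR_{21}^{(1)}}$ subsets indexed by $w_{21}^{(1)} \in \mathcal{W}_{21}^{(1)} = \left\{1, 2, \dots, 2^{nR_{21}^{(1)}}\right\}$. Denote these subsets with
$S^{(1)}_{21,w_{21}^{(1)}}$.

• For the $k$'th conference step from $R_{x1}$ to $R_{x2}$, $R_{x1}$ considers each combination of $z_{12}^{(1)}, z_{12}^{(2)}, \dots, z_{12}^{(k-1)}$,
$z_{21}^{(1)}, z_{21}^{(2)}, \dots, z_{21}^{(k-1)}$. For each combination, $R_{x1}$ generates a codebook with $2^{nR'^{(k)}_{12}}$ messages indexed by $z_{12}^{(k)} \in
\mathcal{Z}_{12}^{(k)} = \left\{1, 2, \dots, 2^{nR'^{(k)}_{12}}\right\}$, according to the distribution $p\left(\hat{y}_1^{(k)}|\hat{y}_1^{(1)}, \hat{y}_1^{(2)}, \dots, \hat{y}_1^{(k-1)}, \hat{y}_2^{(1)}, \hat{y}_2^{(2)}, \dots, \hat{y}_2^{(k-1)}\right)$.
$R_{x1}$ uniformly and independently partitions the message set $\mathcal{Z}_{12}^{(k)}$ into $2^{nR_{12}^{(k)}}$ subsets indexed by $w_{12}^{(k)} \in
\mathcal{W}_{12}^{(k)} = \left\{1, 2, \dots, 2^{nR_{12}^{(k)}}\right\}$. Denote these subsets with $S^{(k)}_{12,w_{12}^{(k)}}$.

• The codebook for the $k$'th conference step from $R_{x2}$ to $R_{x1}$ is generated in a parallel manner for each
combination of $z_{12}^{(1)}, z_{12}^{(2)}, \dots, z_{12}^{(k)}$, $z_{21}^{(1)}, z_{21}^{(2)}, \dots, z_{21}^{(k-1)}$.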

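4) Decoding and Encoding at $R_{x1}$ at the $k$'th Conference Cycle ($k \le K$) for Transmission Block $i$: $R_{x1}$ needs first
to decode the message $z_{21}^{(k-1)}$ sent from $R_{x2}$ at the $(k-1)$'th cycle. To that end, $R_{x1}$ uses $w_{21}^{(k-1)}$, the index received
from $R_{x2}$ at the $(k-1)$'th conference step. In decoding $z_{21}^{(k-1)}$ we assume that all the previous $z_{21}^{(1)}, z_{21}^{(2)}, \dots, z_{21}^{(k-2)}$
were correctly decoded at $R_{x1}$. We denote the $\hat{\mathbf{y}}_2$ sequences corresponding to $z_{21}^{(1)}, z_{21}^{(2)}, \dots, z_{21}^{(k-2)}$ by
$\hat{\mathbf{y}}_2(1), \hat{\mathbf{y}}_2(2), \dots, \hat{\mathbf{y}}_2(k-2)$, and similarly define $\hat{\mathbf{y}}_1(1), \hat{\mathbf{y}}_1(2), \dots, \hat{\mathbf{y}}_1(k-1)$.

• $R_{x1}$ first generates the set $\mathcal{L}_1(k-1)$ defined by:

\[
\begin{aligned}
\mathcal{L}_1(k-1) = \Big\{ z_{21}^{(k-1)} \in \mathcal{Z}_{21}^{(k-1)} : \Big(&\hat{\mathbf{y}}_2^{(k-1)}\left(z_{21}^{(k-1)}\,\middle|\,z_{12}^{(1)}, z_{12}^{(2)}, \dots, z_{12}^{(k-1)}, z_{21}^{(1)}, z_{21}^{(2)}, \dots, z_{21}^{(k-2)}\right), \\
&\hat{\mathbf{y}}_1(1), \hat{\mathbf{y}}_1(2), \dots, \hat{\mathbf{y}}_1(k-1), \hat{\mathbf{y}}_2(1), \hat{\mathbf{y}}_2(2), \dots, \hat{\mathbf{y}}_2(k-2), \mathbf{y}_1(i)\Big) \in A^{*(n)}_{\epsilon} \Big\}.
\end{aligned}
\]

• $R_{x1}$ then looks for a unique $z_{21}^{(k-1)} \in \mathcal{Z}_{21}^{(k-1)}$ such that $z_{21}^{(k-1)} \in \mathcal{L}_1(k-1) \cap S^{(k-1)}_{21,w_{21}^{(k-1)}}$. If there is none or
there is more than one, an error is declared.

• From an argument similar to [30], the probability of error can be made arbitrarily small by taking $n$ large
enough as long as

\[
R'^{(k-1)}_{21} < I\left(\hat{Y}_2^{(k-1)};Y_1\,\middle|\,\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(k-2)}\right) + R_{21}^{(k-1)} - \epsilon.
\]

Here, $k > 1$, since for the first conference message from $R_{x1}$ to $R_{x2}$ no decoding takes place.
In generating the $k$'th conference message to $R_{x2}$, it is assumed that all the previous $k-1$ messages from $R_{x2}$
were decoded correctly.

• $R_{x1}$ looks for a message $z_{12}^{(k)} \in \mathcal{Z}_{12}^{(k)}$ such that

\[
\begin{aligned}
\Big(&\hat{\mathbf{y}}_1^{(k)}\left(z_{12}^{(k)}\,\middle|\,z_{12}^{(1)}, z_{12}^{(2)}, \dots, z_{12}^{(k-1)}, z_{21}^{(1)}, z_{21}^{(2)}, \dots, z_{21}^{(k-1)}\right), \\
&\hat{\mathbf{y}}_1(1), \hat{\mathbf{y}}_1(2), \dots, \hat{\mathbf{y}}_1(k-1), \hat{\mathbf{y}}_2(1), \hat{\mathbf{y}}_2(2), \dots, \hat{\mathbf{y}}_2(k-1), \mathbf{y}_1(i)\Big) \in A^{*(n)}_{\epsilon}.
\end{aligned}
\]

From the argument in [30], the probability that such a sequence exists can be made arbitrarily close to 1 by
taking $n$ large enough as long as

\[
R'^{(k)}_{12} > I\left(\hat{Y}_1^{(k)};Y_1\,\middle|\,\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(k-1)}\right) + \epsilon.
\]

• $R_{x1}$ looks for the partition of $\mathcal{Z}_{12}^{(k)}$ into which $z_{12}^{(k)}$ belongs. Denote the index of this partition with $w_{12}^{(k)}$.

• $R_{x1}$ transmits $w_{12}^{(k)}$ to $R_{x2}$ through the conference link.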

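5) Decoding and Encoding at $R_{x2}$ at the $k$'th Conference Step ($k \le K$) for Transmission Block $i$: Using similar
arguments to section V-B.4 we obtain the following rate constraints:

• Decoding $z_{12}^{(k)}$ at $R_{x2}$ can be done with an arbitrarily small probability of error by taking $n$ large enough as
long as

\[
R'^{(k)}_{12} < I\left(\hat{Y}_1^{(k)};Y_2\,\middle|\,\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(k-1)}\right) + R_{12}^{(k)} - \epsilon.
\]

• Encoding $z_{21}^{(k)}$ can be done with an arbitrarily small probability of error by taking $n$ large enough as long as

\[
R'^{(k)}_{21} > I\left(\hat{Y}_2^{(k)};Y_2\,\middle|\,\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(k)}, \hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(k-1)}\right) + \epsilon.
\]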



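6) Combining All Conference Rate Bounds: First consider the bounds on $R'^{(k)}_{12}$, $k = 1, 2, \dots, K$:

\[
I\left(\hat{Y}_1^{(k)};Y_1\,\middle|\,\hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + \epsilon < R'^{(k)}_{12} < I\left(\hat{Y}_1^{(k)};Y_2\,\middle|\,\hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + R_{12}^{(k)} - \epsilon.
\]

This can be satisfied only if

\[
\begin{aligned}
R_{12}^{(k)} &> I\left(\hat{Y}_1^{(k)};Y_1\,\middle|\,\hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) - I\left(\hat{Y}_1^{(k)};Y_2\,\middle|\,\hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + 2\epsilon \\
&= H\left(\hat{Y}_1^{(k)}\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) - H\left(\hat{Y}_1^{(k)}\,\middle|\,Y_1, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + 2\epsilon \\
&= H\left(\hat{Y}_1^{(k)}\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) - H\left(\hat{Y}_1^{(k)}\,\middle|\,Y_1, Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + 2\epsilon \\
&= I\left(\hat{Y}_1^{(k)};Y_1\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + 2\epsilon.
\end{aligned}
\]

Hence

\[
\begin{aligned}
C_{12} = \sum_{k=1}^{K} R_{12}^{(k)} &\ge \sum_{k=1}^{K} I\left(\hat{Y}_1^{(k)};Y_1\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + 2K\epsilon \\
&= \sum_{k=1}^{K} \Big[ I\left(\hat{Y}_1^{(k)};Y_1\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) \\
&\qquad\quad + I\left(\hat{Y}_2^{(k)};Y_1\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) \Big] + 2K\epsilon \\
&= \sum_{k=1}^{K} I\left(\hat{Y}_1^{(k)}, \hat{Y}_2^{(k)};Y_1\,\middle|\,Y_2, \hat{Y}_1^{(1)}, \dots, \hat{Y}_1^{(k-1)}, \hat{Y}_2^{(1)}, \dots, \hat{Y}_2^{(k-1)}\right) + 2K\epsilon \\
&= I\left(\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(K)}, \hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(K)};Y_1\,\middle|\,Y_2\right) + 2K\epsilon, \qquad (49)
\end{aligned}
\]

and similarly

\[
C_{21} \ge I\left(\hat{Y}_1^{(1)}, \hat{Y}_1^{(2)}, \dots, \hat{Y}_1^{(K)}, \hat{Y}_2^{(1)}, \hat{Y}_2^{(2)}, \dots, \hat{Y}_2^{(K)};Y_2\,\middle|\,Y_1\right) + 2K\epsilon. \qquad (50)
\]

This provides the rate constraints on the conference auxiliary variables of (46a) and (46b).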

7) Decoding at $R_{x1}$: $R_{x1}$ uses $\mathbf{y}_1(i)$ and $\hat{\mathbf{y}}_2^{(1)}, \hat{\mathbf{y}}_2^{(2)}, \dots, \hat{\mathbf{y}}_2^{(K)}$ received from $R_{x2}$, to decode $(l_i, w_{1,i})$ as follows:

• $R_{x1}$ looks for a unique message $l \in \mathcal{M}_0$ such that

\[
\left(\mathbf{w}(l), \mathbf{y}_1(i), \hat{\mathbf{y}}_2^{(1)}, \hat{\mathbf{y}}_2^{(2)}, \dots, \hat{\mathbf{y}}_2^{(K)}\right) \in A^{*(n)}_{\epsilon}.
\]

From the point-to-point channel capacity theorem (see [29]), this can be done with an arbitrarily small
probability of error by taking $n$ large enough as long as

\[
R_0 \le I(W;Y_1,\hat{\mathbf{Y}}_2). \qquad (51)
\]

Denote the decoded message $\hat{l}_i$. Now $R_{x1}$ decodes $w_{1,i}$ by looking for a unique $k \in \mathcal{Z}_1$ such that

\[
\left(\mathbf{u}(k|\hat{l}_i), \mathbf{w}(\hat{l}_i), \mathbf{y}_1(i), \hat{\mathbf{y}}_2^{(1)}, \hat{\mathbf{y}}_2^{(2)}, \dots, \hat{\mathbf{y}}_2^{(K)}\right) \in A^{*(n)}_{\epsilon}.
\]

If a unique such $k$ exists, then denote the decoded index with $\hat{k} = k$. Now $R_{x1}$ looks for the partition of $\mathcal{Z}_1$
into which $\hat{k}$ belongs and sets $\hat{w}_{1,i}$ to be the index of that partition: $\hat{k} \in B_{\hat{w}_{1,i}}$. Similarly to the proof in [24,
ch 14.6.2], assuming successful decoding of $l_i$, the probability of error can be made arbitrarily small by taking
$n$ large enough as long as

\[
\frac{1}{n}\log_2\|\mathcal{Z}_1\| \le I(U;Y_1,\hat{\mathbf{Y}}_2|W),
\]

which is satisfied by construction.

8) Decoding at $R_{x2}$: Repeating similar steps for decoding at $R_{x2}$ we get that decoding $l_i$ can be done with an
arbitrarily small probability of error by taking $n$ large enough as long as

\[
R_0 \le I(W;\hat{\mathbf{Y}}_1,Y_2), \qquad (52)
\]

and assuming successful decoding of $l_i$, decoding $w_{2,i}$ with an arbitrarily small probability of error requires that

\[
\frac{1}{n}\log_2\|\mathcal{Z}_2\| \le I(V;\hat{\mathbf{Y}}_1,Y_2|W),
\]

which again is satisfied by construction.

Finally, collecting (48a), (48b), (48c), (51) and (52) gives the achievable rate constraints of theorem 4, and (49)
and (50) give the conference rate constraints of the theorem. ■

C. The Cooperative Broadcast Channel with a Single Common Message 

In the single common message cooperative broadcast scenario, a single transmitter sends a message to two
receivers encoded in a single channel codeword $X^n$. This scenario is depicted in figure 12. After conferencing,
each receiver decodes the message. For this setup we have the following upper bound:

Proposition 3: ([27, theorem 6]) Consider the general broadcast channel $(\mathcal{X}, p(y_1,y_2|x), \mathcal{Y}_1 \times \mathcal{Y}_2)$ with cooperating
receivers having noiseless conference links of finite capacities $C_{12}$ and $C_{21}$ between them. Then, for sending
a common message to both receivers, any rate $R$ must satisfy

\[
R \le \sup_{p_X(x)} \min\left\{I(X;Y_1) + C_{21},\, I(X;Y_2) + C_{12},\, I(X;Y_1,Y_2)\right\}.
\]

Fig. 12. The broadcast channel with cooperating receivers, for the single common message case. $\hat{W}$ and $\hat{\hat{W}}$ are the estimates of $W$ at $R_{x1}$
and $R_{x2}$ respectively.



In [27] we also derived the following achievable rate for this scenario:

Proposition 4: ([27, theorem 5]) Assume the broadcast channel setup of proposition 3. Then, for sending a
common message to both receivers, any rate $R$ satisfying

\[
\begin{aligned}
R &\le \sup_{p_X(x)} \max\left\{R_{12}(p_X(x)),\, R_{21}(p_X(x))\right\}, \\
R_{12}(p_X(x)) &= \min\left(I(X;Y_1) + C_{21},\, \max\left\{I(X;Y_2),\, I(X;Y_2) - H(Y_1|Y_2,X) + \min\left(C_{12}, H(Y_1|Y_2)\right)\right\}\right) && (53a) \\
R_{21}(p_X(x)) &= \min\left(I(X;Y_2) + C_{12},\, \max\left\{I(X;Y_1),\, I(X;Y_1) - H(Y_2|Y_1,X) + \min\left(C_{21}, H(Y_2|Y_1)\right)\right\}\right) && (53b)
\end{aligned}
\]

is achievable.

Note that this rate expression depends only on the parameters of the problem and is, therefore, computable. In
proposition 4 the achievable rate increases linearly with the cooperation capacity. The downside of this method is
that it produces a rate increase over the non-cooperative rate only for conference link capacities that exceed some
minimum values.
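
Because (53a) and (53b) involve only standard information quantities, they can be evaluated directly once those quantities are known for a given $p_X(x)$. The helper below is a minimal sketch that takes the quantities as precomputed numbers (in bits); the function name and argument names are hypothetical, and the supremum over $p_X(x)$ is left to the caller.

```python
def joint_decoding_rate(i_x_y1, i_x_y2,
                        h_y1_given_y2, h_y2_given_y1,
                        h_y1_given_y2_x, h_y2_given_y1_x,
                        c12, c21):
    """Evaluate max{R12, R21} from (53a)-(53b) for one fixed input distribution."""
    r12 = min(i_x_y1 + c21,
              max(i_x_y2,
                  i_x_y2 - h_y1_given_y2_x + min(c12, h_y1_given_y2)))
    r21 = min(i_x_y2 + c12,
              max(i_x_y1,
                  i_x_y1 - h_y2_given_y1_x + min(c21, h_y2_given_y1)))
    return max(r12, r21)


if __name__ == "__main__":
    # Illustrative numbers only: a symmetric setup with I(X;Y1) = I(X;Y2) = 0.3,
    # H(Y1|Y2) = H(Y2|Y1) = 0.9 and H(Y1|Y2,X) = H(Y2|Y1,X) = 0.6.
    for c in (0.0, 0.3, 0.6, 0.9):
        print(c, joint_decoding_rate(0.3, 0.3, 0.9, 0.9, 0.6, 0.6, c, c))
```

With these illustrative numbers the rate stays at 0.3 until the conference capacity exceeds 0.6, which is the threshold behavior described above.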

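Specializing the three independent messages result to the single common message case we obtain the following
achievable rate with a $K$-cycle conference for the general BC with a single common message:

Corollary 3: Consider the general broadcast channel with cooperating receivers, having noiseless conference
links of finite capacities $C_{12}$ and $C_{21}$ between them. Let the receivers hold a conference that consists of $K$ cycles.
Then, any rate $R$ satisfying

\[
R = \max\left\{R_{12}, R_{21}\right\}, \qquad (54)
\]

is achievable.

Here $R_{12}$ is defined as follows:

\[
R_{12} = \sup_{p_X(x),\,\alpha \in [0,1]} \min\left\{R_1, R_2\right\}, \qquad (55)
\]

with

\[
\begin{aligned}
R_1 &= I\left(X;Y_1,\hat{Y}_2^{(1)},\hat{Y}_2^{(2)},\dots,\hat{Y}_2^{(K-1)}\right) + \alpha C_{21}, && (56a) \\
R_2 &= I\left(X;Y_2,\hat{Y}_1^{(1)},\hat{Y}_1^{(2)},\dots,\hat{Y}_1^{(K)}\right), && (56b)
\end{aligned}
\]

subject to

\[
\begin{aligned}
C_{12} &\ge I\left(\hat{Y}_1^{(1)},\hat{Y}_1^{(2)},\dots,\hat{Y}_1^{(K)},\hat{Y}_2^{(1)},\hat{Y}_2^{(2)},\dots,\hat{Y}_2^{(K-1)};Y_1\,\middle|\,Y_2\right), && (57a) \\
(1-\alpha)C_{21} &\ge I\left(\hat{Y}_1^{(1)},\hat{Y}_1^{(2)},\dots,\hat{Y}_1^{(K)},\hat{Y}_2^{(1)},\hat{Y}_2^{(2)},\dots,\hat{Y}_2^{(K-1)};Y_2\,\middle|\,Y_1\right), && (57b)
\end{aligned}
\]

for the joint distribution

\[
\begin{aligned}
&p\left(x,y_1,y_2,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(K)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(K-1)}\right) = \\
&\quad p(x)\,p(y_1,y_2|x)\,p\left(\hat{y}_1^{(1)}|y_1\right)p\left(\hat{y}_2^{(1)}|y_2,\hat{y}_1^{(1)}\right)\cdots p\left(\hat{y}_1^{(k)}|y_1,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(k-1)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(k-1)}\right) \\
&\quad p\left(\hat{y}_2^{(k)}|y_2,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(k)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(k-1)}\right)\cdots p\left(\hat{y}_2^{(K-1)}|y_2,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(K-1)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(K-2)}\right) \\
&\quad \times p\left(\hat{y}_1^{(K)}|y_1,\hat{y}_1^{(1)},\hat{y}_1^{(2)},\dots,\hat{y}_1^{(K-1)},\hat{y}_2^{(1)},\hat{y}_2^{(2)},\dots,\hat{y}_2^{(K-1)}\right).
\end{aligned}
\]

The cardinalities of the $k$'th auxiliary random variables are bounded by:

\[
\begin{aligned}
\|\hat{\mathcal{Y}}_1^{(k)}\| &\le \|\mathcal{Y}_1\| \times \prod_{l=1}^{k-1}\|\hat{\mathcal{Y}}_1^{(l)}\| \times \prod_{l=1}^{k-1}\|\hat{\mathcal{Y}}_2^{(l)}\| + 1, \qquad k = 1, 2, \dots, K \\
\|\hat{\mathcal{Y}}_2^{(k)}\| &\le \|\mathcal{Y}_2\| \times \prod_{l=1}^{k}\|\hat{\mathcal{Y}}_1^{(l)}\| \times \prod_{l=1}^{k-1}\|\hat{\mathcal{Y}}_2^{(l)}\| + 1, \qquad k = 1, 2, \dots, K-1.
\end{aligned}
\]

$R_{21}$ is defined in a parallel manner to $R_{12}$, with $R_{x2}$ performing the first conference step, and the appropriate
change in the probability chain.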

The proof of corollary [5] is provided in appendix 151 

We note that [12, theorem 2] presents a similar result for this scenario, under the constraint that the memoryless
broadcast channel can be decomposed as $p(\mathbf{y}_1,\mathbf{y}_2|\mathbf{x}) = \prod_{i=1}^{n} p(y_{1,i}|x_i)\,p(y_{2,i}|x_i)$, and considering the sum-rate of
the conference. Here we show that the same achievable rate expressions hold for the general memoryless broadcast
channel. A recent result appears in [31], where the single common message case for a Gaussian BC is considered.
In the multi-cycle conference considered in this section, we let the auxiliary RVs follow a more general chain than
that of [31], which results in a larger achievable rate.

D. A Single-Cycle Conference with TS-EAF 

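Consider the case where only a single cycle of conferencing between the receivers is allowed. Specializing
corollary 3 to a single cycle case we obtain

\[
\begin{aligned}
R_1 &= I(X;Y_1) + C_{21} && (58a) \\
R_2 &= I(X;Y_2,\hat{Y}_1^{(1)}) && (58b) \\
C_{12} &\ge I(\hat{Y}_1^{(1)};Y_1|Y_2), && (58c)
\end{aligned}
\]

and the TS-EAF assignment is

\[
p\left(\hat{y}_1^{(1)}|y_1\right) = \begin{cases} q_1, & \hat{y}_1^{(1)} = y_1 \\ 1 - q_1, & \hat{y}_1^{(1)} = E \notin \mathcal{Y}_1. \end{cases}
\]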
Applying the TS-EAF assignment to (58c) and (58b) we obtain

\[
\begin{aligned}
C_{12} &\ge I(Y_1;\hat{Y}_1^{(1)}|Y_2) \\
&= H(Y_1|Y_2) - H(Y_1|Y_2,\hat{Y}_1^{(1)}) \\
&= H(Y_1|Y_2) - q_1 H(Y_1|Y_2,Y_1) - (1 - q_1)H(Y_1|Y_2) \\
&= q_1 H(Y_1|Y_2), \\
R_2 &= I(X;Y_2,\hat{Y}_1^{(1)}) \\
&= I(X;Y_2) + H(X|Y_2) - H(X|Y_2,\hat{Y}_1^{(1)}) \\
&= I(X;Y_2) + H(X|Y_2) - (1 - q_1)H(X|Y_2) - q_1 H(X|Y_2,Y_1) \\
&= I(X;Y_2) + q_1 I(X;Y_1|Y_2).
\end{aligned}
\]

Maximizing $R_2$ requires maximizing $q_1 \in [0,1]$. Therefore setting $q_1 = \min\left\{1, \frac{C_{12}}{H(Y_1|Y_2)}\right\}$, we obtain
$R_2 = I(X;Y_2) + \min\left\{1, \frac{C_{12}}{H(Y_1|Y_2)}\right\} I(X;Y_1|Y_2)$. Combining with $R_1$ we have that the rate when $R_{x2}$ decodes first is given by

\[
R_{12} = \min\left\{I(X;Y_1) + C_{21},\; I(X;Y_2) + \min\left\{1, \frac{C_{12}}{H(Y_1|Y_2)}\right\} I(X;Y_1|Y_2)\right\},
\]

and by a symmetric argument we can obtain $R_{21}$. We conclude that the rate for the single-cycle conference with
TS-EAF is given by

\[
\begin{aligned}
R &= \sup_{p(x)} \max\left\{R_{12}, R_{21}\right\}, \\
R_{12} &= \min\left\{I(X;Y_1) + C_{21},\; I(X;Y_2) + \min\left\{1, \frac{C_{12}}{H(Y_1|Y_2)}\right\} I(X;Y_1|Y_2)\right\}, \\
R_{21} &= \min\left\{I(X;Y_1) + \min\left\{1, \frac{C_{21}}{H(Y_2|Y_1)}\right\} I(X;Y_2|Y_1),\; I(X;Y_2) + C_{12}\right\}.
\end{aligned}
\]

We note that this rate is always better than the point-to-point rate and also better than the joint-decoding rate
of proposition 4 (whenever cooperation can provide a rate increase). However, as in proposition 4, at least one
receiver has to satisfy the Slepian-Wolf condition for the full cooperation rate to be achieved. We also note that
using TS-EAF with more than two steps does not improve upon this result.
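
The single-cycle TS-EAF rate depends only on entropies of the joint distribution of $(X, Y_1, Y_2)$, so for a discrete channel and a fixed $p(x)$ it can be evaluated directly. The sketch below is one way to do so under stated assumptions: the joint distribution is supplied as a dictionary keyed by $(x, y_1, y_2)$, $R_{12}$ and $R_{21}$ are combined with a maximum as in (54) of corollary 3, the supremum over $p(x)$ is not carried out, and the two-BSC example channel and all function names are hypothetical. The upper bound of proposition 3 is returned alongside for comparison.

```python
import math
from collections import defaultdict


def entropy(pmf):
    """Entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0.0)


def marginal(joint, keep):
    """Marginalize a joint pmf over (x, y1, y2) onto the coordinates in keep."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in keep)] += p
    return out


def ts_eaf_single_cycle(joint, c12, c21):
    """Return (TS-EAF rate, upper bound of proposition 3) for a fixed p(x)."""
    def h(*keep):
        return entropy(marginal(joint, keep))

    X, Y1, Y2 = 0, 1, 2
    i_x_y1 = h(X) + h(Y1) - h(X, Y1)
    i_x_y2 = h(X) + h(Y2) - h(X, Y2)
    h_y1_g_y2 = h(Y1, Y2) - h(Y2)
    h_y2_g_y1 = h(Y1, Y2) - h(Y1)
    i_x_y1_g_y2 = h(X, Y2) + h(Y1, Y2) - h(X, Y1, Y2) - h(Y2)
    i_x_y2_g_y1 = h(X, Y1) + h(Y1, Y2) - h(X, Y1, Y2) - h(Y1)
    # Time-sharing fractions q1, q2 chosen as large as the conference links allow.
    q1 = 1.0 if h_y1_g_y2 <= 0.0 else min(1.0, c12 / h_y1_g_y2)
    q2 = 1.0 if h_y2_g_y1 <= 0.0 else min(1.0, c21 / h_y2_g_y1)
    r12 = min(i_x_y1 + c21, i_x_y2 + q1 * i_x_y1_g_y2)
    r21 = min(i_x_y2 + c12, i_x_y1 + q2 * i_x_y2_g_y1)
    upper = min(i_x_y1 + c21, i_x_y2 + c12, i_x_y1 + i_x_y2_g_y1)
    return max(r12, r21), upper


def two_bsc_joint(eps):
    """Hypothetical example: uniform binary X seen through two independent BSC(eps)."""
    joint = {}
    for x in (0, 1):
        for y1 in (0, 1):
            for y2 in (0, 1):
                p1 = eps if y1 != x else 1.0 - eps
                p2 = eps if y2 != x else 1.0 - eps
                joint[(x, y1, y2)] = 0.5 * p1 * p2
    return joint


if __name__ == "__main__":
    joint = two_bsc_joint(0.1)
    for c in (0.0, 0.2, 0.4, 0.6, 0.8):
        print(c, ts_eaf_single_cycle(joint, c, c))
```

In this symmetric example the TS-EAF rate grows linearly from the point-to-point rate at $C = 0$ and meets the upper bound once $C \ge H(Y_1|Y_2)$.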

Finally, we demonstrate the results of proposition 4 and corollary 3 through a symmetric BC example: consider
the symmetric broadcast channel where $\mathcal{Y}_1 = \mathcal{Y}_2 = \mathcal{Y}$ and

\[
p_{Y_1|Y_2,X}(a|b,x) = p_{Y_2|Y_1,X}(a|b,x),
\]

for any $a, b \in \mathcal{Y}$ and $x \in \mathcal{X}$. Let $C_{21} = C_{12} = C$. For this scenario we have that $R_{12} = R_{21}$ in corollary 3 and
also $R_{12}(p_X(x)) = R_{21}(p_X(x))$ in proposition 4. The resulting rate is depicted in figure 13 for a fixed probability
$p(x)$. We can see that for this case, time-sharing exceeds joint-decoding for all values of $C$. Both methods meet
the upper bound at $C = H(Y_1|Y_2)$. We note that this is a corrected version of the figure in [32].

Fig. 13. The achievable rate R vs. conference capacity C, for proposition 3 (dashed-dot), proposition 4 (dashed) and corollary
3 (solid), for the symmetric broadcast channel.

VI. Conclusions 

In this paper we considered the EAF technique using time-sharing on the auxiliary RVs. We first showed that 
incorporating joint-decoding at the destination into the EAF technique results in a special case of the classic EAF 
of [2, theorem 6]. We then used the time-sharing assignment of the auxiliary RVs to obtain an easily computable 
achievable rate for the multiple-relay case, which can be compared against the DAF-based results, to select the 
highest rate for any given scenario. Next, we showed that for the Gaussian relay channel with coded modulation, the 
Gaussian auxiliary RV assignment is not always optimal, and a TS-EAF implementing a per-symbol hard decision 
may sometimes perform better. Finally, we considered a third application of TS-EAF to the cooperative broadcast 
scenario with a multi-cycle conference. We first derived an achievable rate for the general channel, and then we 
specialized it to the single-cycle conference for which we obtained an explicit achievable rate. This rate is superior 
to the explicit expression that can be obtained with joint-decoding. 

VII. Acknowledgements 

In the final version. 

Appendix A
Expressions for Section IV

A. Hard-Decision Estimate-and-Forward 

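We evaluate $I(X;\hat{Y}_1,Y)$, with $p(\hat{Y}_1|Y_1)$ given by (40a) and (40b) using:

\[
I(X;\hat{Y}_1,Y) = I(X;\hat{Y}_1) + I(X;Y|\hat{Y}_1).
\]

1) Evaluating $I(X;\hat{Y}_1)$: Note that both $X$ and $\hat{Y}_1$ are discrete RVs, therefore $I(X;\hat{Y}_1)$ can be evaluated using
the discrete entropies. The conditional distribution of $\hat{Y}_1$ given $X$ is given by:

\[
p(\hat{Y}_1|X = \sqrt{P}) = \begin{cases} P_1 \cdot P_{\text{no erase}}, & 1 \\ 1 - P_{\text{no erase}}, & E \\ (1 - P_1) \cdot P_{\text{no erase}}, & -1 \end{cases} \qquad (A.1)
\]

where

\[
P_1 = \Pr(Y_1 > 0|X = \sqrt{P}).
\]

$p(\hat{Y}_1|X = -\sqrt{P})$ can be obtained from $p(\hat{Y}_1|X = \sqrt{P})$ by switching 1 and $-1$ in (A.1).

2) Evaluating $I(X;Y|\hat{Y}_1)$: write first

\[
I(X;Y|\hat{Y}_1) = h(Y|\hat{Y}_1) - h(Y|\hat{Y}_1,X),
\]

and we note that

\[
h(Y|\hat{Y}_1,X) = h(X + N|\hat{Y}_1,X) = h(N|\hat{Y}_1,X) = h(N) = \frac{1}{2}\log_2(2\pi e\sigma^2).
\]

Using the chain rule we write

\[
h(Y|\hat{Y}_1) = p(\hat{Y}_1 = 1)h(Y|\hat{Y}_1 = 1) + p(\hat{Y}_1 = E)h(Y|\hat{Y}_1 = E) + p(\hat{Y}_1 = -1)h(Y|\hat{Y}_1 = -1),
\]

$p(\hat{Y}_1)$ can be obtained by combining (38) and (A.1) which results in

\[
p(\hat{Y}_1) = \begin{cases} \frac{1}{2}P_{\text{no erase}}, & 1 \\ 1 - P_{\text{no erase}}, & E \\ \frac{1}{2}P_{\text{no erase}}, & -1 \end{cases}, \qquad (A.2)
\]

and we note that $h(Y|\hat{Y}_1 = E) = h(Y)$, since erasure is equivalent to no prior information. Finally we note
that by definition

\[
h(Y) = -\int_{y=-\infty}^{\infty} f(y)\log_2(f(y))dy,
\]

where

\[
\begin{aligned}
f(Y) &= \Pr(X = \sqrt{P})f(Y|X = \sqrt{P}) + \Pr(X = -\sqrt{P})f(Y|X = -\sqrt{P}) \\
&= \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2) + G_y(-\sqrt{P},\sigma^2)\right), \qquad (A.3)
\end{aligned}
\]

\[
G_x(a,b) = \frac{1}{\sqrt{2\pi b}}\,e^{-\frac{(x-a)^2}{2b}}. \qquad (A.4)
\]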
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Next, we have 



poo 

h{Y\Y x = 1) = - / /(y|yi = 1) Iog 2 (/(y|yi = l))dy (A.5) 

J V — — 00 

/(Yin = i) 



'y=—oo 
/(YY = 1) 
Pr(Y = 1) 
/(Y,Y >0)F r 



no erase 



Pr(Y > 0)P no 

erase 

/(y,y>o) 



Pr(Y > 0) ' 

/(y, y > o) = Pr(x = \/p)/(y y > o|x = Vp) + p r (x = -Vp)/(y y > o|x = -VP) 



(A.6) 



Using 



we obtain 



i (/(y y > o|x = \/p) + /(y y > o|x = -VP)) . (A.7) 



fY, Yl {y,Vi\x)=M \ | I , I * ° | 1 =Gj / (o;,(r 2 )G , j;i (5-a;,^), 

/•OO /*00 

/(yy>0|X)=/ f{y,y 1 \x)dy 1 = G y {x,a 2 ) G yi (g ■ x, a\)dy x 



/yi=0 Jyi=0 
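The integrals in (A.3)-(A.7) have no closed form, so $I(X;Y|\hat{Y}_1)$ has to be evaluated numerically. The following Python sketch illustrates one way to do so. The function names, the midpoint integration rule and the truncation of the integration range to ten standard deviations are assumptions made for this illustration and are not part of the derivation. The sketch uses $\Pr(Y_1 > 0|X = \sqrt{P}) = Q(-g\sqrt{P}/\sigma_1)$, $\Pr(Y_1 > 0) = \frac{1}{2}$, and the symmetry $h(Y|\hat{Y}_1 = -1) = h(Y|\hat{Y}_1 = 1)$.

```python
import math

def gauss_pdf(x, mean, var):
    """G_x(a, b) of (A.4): Gaussian density with mean a and variance b."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def q_func(x):
    """Gaussian tail probability Pr(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def diff_entropy_bits(pdf, lo, hi, steps=4000):
    """Midpoint-rule approximation of -int pdf(y) log2 pdf(y) dy over [lo, hi]."""
    dy = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        p = pdf(lo + (k + 0.5) * dy)
        if p > 0.0:
            total -= p * math.log2(p) * dy
    return total

def hd_eaf_cond_mutual_info(power, var, g, var1, p_no_erase):
    """I(X; Y | Yhat1) in bits for the hard-decision relay with random erasure."""
    a = math.sqrt(power)
    lo, hi = -a - 10.0 * math.sqrt(var), a + 10.0 * math.sqrt(var)
    # Pr(Y1 > 0 | X = +sqrt(P)); for X = -sqrt(P) the value is 1 - p1.
    p1 = q_func(-g * a / math.sqrt(var1))

    def f_y(y):            # mixture density of (A.3)
        return 0.5 * (gauss_pdf(y, a, var) + gauss_pdf(y, -a, var))

    def f_y_given_one(y):  # (A.7) divided by Pr(Y1 > 0) = 1/2
        return gauss_pdf(y, a, var) * p1 + gauss_pdf(y, -a, var) * (1.0 - p1)

    h_y = diff_entropy_bits(f_y, lo, hi)
    h_y_one = diff_entropy_bits(f_y_given_one, lo, hi)  # equals h(Y | Yhat1 = -1)
    h_y_given_yhat = p_no_erase * h_y_one + (1.0 - p_no_erase) * h_y
    h_noise = 0.5 * math.log2(2.0 * math.pi * math.e * var)
    return h_y_given_yhat - h_noise
```

With $g = 0$ the relay decision carries no information on $X$, so the function returns $I(X;Y)$ regardless of `p_no_erase`, which serves as a simple sanity check of the sketch.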

Next we need to evaluate $I(Y_1;\hat{Y}_1|Y) = h(Y_1|Y) - h(Y_1|Y,\hat{Y}_1)$: 
1) $h(Y_1|Y) = h(Y,Y_1) - h(Y)$. Here 

\[ h(Y,Y_1) = -\int_{y=-\infty}^{\infty}\int_{y_1=-\infty}^{\infty} f(y,y_1)\log_2(f(y,y_1))\,dy\,dy_1, \]

\[ f(y,y_1) = \frac{1}{2}\left(f(y,y_1|x = \sqrt{P}) + f(y,y_1|x = -\sqrt{P})\right), \]

\[ f(y,y_1|x) = G_y(x,\sigma^2)\,G_{y_1}(g \cdot x,\sigma_1^2). \]
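2) By the definition of conditional entropy we have 

\[ h(Y_1|Y,\hat{Y}_1) = p(\hat{Y}_1 = 1)\,h(Y_1|Y,\hat{Y}_1 = 1) + p(\hat{Y}_1 = E)\,h(Y_1|Y,\hat{Y}_1 = E) + p(\hat{Y}_1 = -1)\,h(Y_1|Y,\hat{Y}_1 = -1), \]

where $h(Y_1|Y,\hat{Y}_1 = E) = h(Y_1|Y)$, and for $\hat{Y}_1 = 1$, for example, we have 

\[ h(Y_1|Y,\hat{Y}_1 = 1) = -\int_{y=-\infty}^{\infty}\int_{y_1=-\infty}^{\infty} f(y,y_1|\hat{y}_1 = 1)\log_2(f(y_1|y,\hat{y}_1 = 1))\,dy\,dy_1. \]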

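Finally, we need to derive the distributions $f(y,y_1|\hat{y}_1 = 1)$ and $f(y_1|y,\hat{y}_1 = 1)$. Begin with 

\[
f_{Y,Y_1|\hat{Y}_1}(y,y_1|\hat{y}_1 = 1) = \frac{f_{Y,Y_1,\hat{Y}_1}(y,y_1,\hat{y}_1 = 1)}{\Pr(\hat{Y}_1 = 1)}
= \begin{cases}
\dfrac{f_{Y,Y_1}(y,y_1)\,P_{\text{no erase}}}{\Pr(Y_1 > 0)\,P_{\text{no erase}}} = \dfrac{f_{Y,Y_1}(y,y_1)}{\Pr(Y_1 > 0)}, & y_1 > 0 \\
0, & y_1 < 0
\end{cases}
\]

and due to the symmetry, $\Pr(Y_1 > 0) = \Pr(Y_1 < 0) = \frac{1}{2}$. We also have 

\[
f(y_1|y,\hat{y}_1 = 1) = \frac{f(y_1,y|\hat{y}_1 = 1)}{f(y|\hat{y}_1 = 1)}
= \frac{f(y,y_1|Y_1 > 0)}{f(y|Y_1 > 0)}
= \frac{f(y_1,y)/\Pr(Y_1 > 0)}{f(y,Y_1 > 0)/\Pr(Y_1 > 0)}
= \frac{f(y_1,y)}{f(y,Y_1 > 0)}, \quad y_1 > 0,
\]

\[ f(y_1|y,\hat{y}_1 = 1) = 0, \quad y_1 < 0. \]

B. Evaluation of the Rate with DHD 

We evaluate the achievable rate using $I(X;Y,\hat{Y}_1) = I(X;\hat{Y}_1) + I(X;Y|\hat{Y}_1)$. The distribution of $\hat{Y}_1$ is given by: 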

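\begin{align*}
\Pr(\hat{Y}_1 = 1) &= \Pr(Y_1 > T) = \frac{1}{2}\left(\Pr(Y_1 > T|X = \sqrt{P}) + \Pr(Y_1 > T|X = -\sqrt{P})\right) \\
&= \frac{1}{2}\left(\int_{y_1 > T} G_{y_1}(g\sqrt{P},\sigma_1^2)\,dy_1 + \int_{y_1 > T} G_{y_1}(-g\sqrt{P},\sigma_1^2)\,dy_1\right), \\
\Pr(\hat{Y}_1 = E) &= \Pr(|Y_1| \le T) = \frac{1}{2}\left(\Pr(|Y_1| \le T|X = \sqrt{P}) + \Pr(|Y_1| \le T|X = -\sqrt{P})\right) \\
&= \frac{1}{2}\left(\int_{y_1=-T}^{T} G_{y_1}(g\sqrt{P},\sigma_1^2)\,dy_1 + \int_{y_1=-T}^{T} G_{y_1}(-g\sqrt{P},\sigma_1^2)\,dy_1\right),
\end{align*}

and by symmetry, $\Pr(\hat{Y}_1 = 1) = \Pr(\hat{Y}_1 = -1)$ and $H(\hat{Y}_1|X = \sqrt{P}) = H(\hat{Y}_1|X = -\sqrt{P})$. Therefore, we need the 
conditional distribution $p(\hat{Y}_1|X = \sqrt{P})$: 

\begin{align*}
\Pr(\hat{Y}_1 = 1|X = \sqrt{P}) &= \Pr(Y_1 > T|X = \sqrt{P}) = \int_{y_1 > T} G_{y_1}(g\sqrt{P},\sigma_1^2)\,dy_1, \\
\Pr(\hat{Y}_1 = -1|X = \sqrt{P}) &= \Pr(Y_1 < -T|X = \sqrt{P}) = \int_{y_1 < -T} G_{y_1}(g\sqrt{P},\sigma_1^2)\,dy_1, \\
\Pr(\hat{Y}_1 = E|X = \sqrt{P}) &= 1 - \Pr(\hat{Y}_1 = 1|X = \sqrt{P}) - \Pr(\hat{Y}_1 = -1|X = \sqrt{P}).
\end{align*}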
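This allows us to evaluate $I(X;\hat{Y}_1) = H(\hat{Y}_1) - H(\hat{Y}_1|X)$. For evaluating $I(X;Y|\hat{Y}_1)$ note that 

\[ h(Y|\hat{Y}_1,X) = h(X + N|\hat{Y}_1,X) = h(N|\hat{Y}_1,X) = h(N) = \frac{1}{2}\log_2(2\pi e \sigma^2), \]

and we need only to evaluate $h(Y|\hat{Y}_1)$: by definition 

\[ h(Y|\hat{Y}_1) = \Pr(\hat{Y}_1 = 1)\,h(Y|\hat{Y}_1 = 1) + \Pr(\hat{Y}_1 = E)\,h(Y|\hat{Y}_1 = E) + \Pr(\hat{Y}_1 = -1)\,h(Y|\hat{Y}_1 = -1), \]

and note that $h(Y|\hat{Y}_1 = E) = h(Y)$. Finally, 

\[ h(Y|\hat{Y}_1 = 1) = -\int_{y=-\infty}^{\infty} f(y|\hat{y}_1 = 1)\log_2(f(y|\hat{y}_1 = 1))\,dy, \]

\[ f_{Y|\hat{Y}_1}(y|\hat{y}_1 = 1) = f_Y(y|Y_1 > T) = \frac{f_{Y,Y_1}(y, Y_1 > T)}{\Pr(Y_1 > T)}, \]

\begin{align*}
f_{Y,Y_1}(y, Y_1 > T) &= \frac{1}{2}\left(f(y, Y_1 > T|X = \sqrt{P}) + f(y, Y_1 > T|X = -\sqrt{P})\right) \\
&= \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)\Pr(Y_1 > T|X = \sqrt{P}) + G_y(-\sqrt{P},\sigma^2)\Pr(Y_1 > T|X = -\sqrt{P})\right).
\end{align*}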
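The discrete term $I(X;\hat{Y}_1) = H(\hat{Y}_1) - H(\hat{Y}_1|X)$ depends only on Gaussian tail probabilities, so it can be computed directly from the threshold $T$. A minimal Python sketch follows. The function names are hypothetical, and the tail probabilities are expressed through the complementary error function, which is our choice of implementation rather than something specified in the derivation.

```python
import math

def q_func(x):
    """Gaussian tail probability Pr(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def entropy_bits(pmf):
    """Discrete entropy in bits, skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)

def dhd_pmf_given_plus(power, g, var1, threshold):
    """(Pr(1), Pr(E), Pr(-1)) of the relay decision given X = +sqrt(P)."""
    mean, std = g * math.sqrt(power), math.sqrt(var1)
    p_one = q_func((threshold - mean) / std)        # Pr(Y1 > T | X = +sqrt(P))
    p_minus_one = q_func((threshold + mean) / std)  # Pr(Y1 < -T | X = +sqrt(P))
    p_erase = max(0.0, 1.0 - p_one - p_minus_one)   # clamp rounding noise
    return (p_one, p_erase, p_minus_one)

def dhd_relay_mutual_info(power, g, var1, threshold):
    """I(X; Yhat1) in bits for equiprobable inputs +sqrt(P) and -sqrt(P)."""
    p_one, p_erase, p_minus_one = dhd_pmf_given_plus(power, g, var1, threshold)
    # For X = -sqrt(P) the roles of 1 and -1 swap, so the marginal is symmetric.
    p_sym = 0.5 * (p_one + p_minus_one)
    marginal = (p_sym, p_erase, p_sym)
    return entropy_bits(marginal) - entropy_bits((p_one, p_erase, p_minus_one))
```

For example, `dhd_relay_mutual_info(1.0, 1.0, 1.0, 0.5)` evaluates the term for $P = g = \sigma_1^2 = 1$ and $T = 0.5$, while setting `g = 0` yields zero, as expected when the relay observes only noise.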
Evaluating $I(Y_1;\hat{Y}_1|Y)$ we have: 

\begin{align*}
I(Y_1;\hat{Y}_1|Y) &= H(\hat{Y}_1|Y) - H(\hat{Y}_1|Y,Y_1) \\
&\overset{(a)}{=} H(\hat{Y}_1|Y) \\
&= H(\hat{Y}_1) + h(Y|\hat{Y}_1) - h(Y),
\end{align*}

where (a) is due to the deterministic mapping from $Y_1$ to $\hat{Y}_1$, and $h(Y)$ can be evaluated using (A.3). 

1) DHD when $T \to 0$: As $T \to 0$ we have that $\Pr(\hat{Y}_1 = E) \to 0$ and $\hat{Y}_1$ converges in distribution to a Bernoulli 
RV with probability $\frac{1}{2}$. Therefore 

\begin{align*}
f(y,\hat{y}_1 = 1) &= \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)\Pr(Y_1 > T|X = \sqrt{P}) + G_y(-\sqrt{P},\sigma^2)\Pr(Y_1 > T|X = -\sqrt{P})\right) \\
&\xrightarrow{T \to 0} \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)\Pr(Y_1 > 0|X = \sqrt{P}) + G_y(-\sqrt{P},\sigma^2)\Pr(Y_1 > 0|X = -\sqrt{P})\right) \\
&= \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)P_+ + G_y(-\sqrt{P},\sigma^2)(1 - P_+)\right),
\end{align*}

where $P_+ = \Pr(Y_1 > 0|X = \sqrt{P})$. Now, letting $g \to 0$ we have that $P_+ \to \frac{1}{2}$ and therefore 

\begin{align*}
f(y|\hat{y}_1 = 1) &\xrightarrow{g \to 0,\, T \to 0} f(y), \\
h(Y|\hat{Y}_1 = 1) &\xrightarrow{g \to 0,\, T \to 0} h(Y).
\end{align*}

We conclude that as $g \to 0$, $T \to 0$, then $h(Y|\hat{Y}_1) \to h(Y)$ and therefore $I(Y_1;\hat{Y}_1|Y)$ becomes 

\[ I(Y_1;\hat{Y}_1|Y) = H(\hat{Y}_1) + h(Y|\hat{Y}_1) - h(Y) \xrightarrow{g \to 0,\, T \to 0} 1. \]

Using the continuity of $I(Y_1;\hat{Y}_1|Y)$ we conclude that for small values of $g$, as $T$ decreases, $I(Y_1;\hat{Y}_1|Y)$ is 
bounded from below. This implies that for small $g$ and small $C$ the feasibility is obtained only for large $T$, which 
in turn implies low rate. 
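The limiting behaviour above can be checked numerically. The sketch below evaluates $H(\hat{Y}_1) + h(Y|\hat{Y}_1) - h(Y)$ for the deterministic hard decision and can be used to confirm that the value approaches one bit as $g$ and $T$ become small. It is only an illustration: the function names, the midpoint integration rule and the truncated integration range are our assumptions.

```python
import math

def q_func(x):
    """Gaussian tail probability Pr(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def gauss_pdf(x, mean, var):
    """Gaussian density with the given mean and variance, as in (A.4)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def diff_entropy_bits(pdf, lo, hi, steps=4000):
    """Midpoint-rule approximation of -int pdf(y) log2 pdf(y) dy over [lo, hi]."""
    dy = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        p = pdf(lo + (k + 0.5) * dy)
        if p > 0.0:
            total -= p * math.log2(p) * dy
    return total

def dhd_feasibility_term(power, var, g, var1, threshold):
    """H(Yhat1) + h(Y | Yhat1) - h(Y) in bits for the deterministic hard decision."""
    a, mean, std1 = math.sqrt(power), g * math.sqrt(power), math.sqrt(var1)
    up_pos = q_func((threshold - mean) / std1)  # Pr(Y1 > T | X = +sqrt(P))
    up_neg = q_func((threshold + mean) / std1)  # Pr(Y1 > T | X = -sqrt(P))
    p_one = 0.5 * (up_pos + up_neg)             # Pr(Yhat1 = 1) = Pr(Yhat1 = -1)
    p_erase = max(0.0, 1.0 - 2.0 * p_one)
    lo, hi = -a - 10.0 * math.sqrt(var), a + 10.0 * math.sqrt(var)

    def f_y(y):
        return 0.5 * (gauss_pdf(y, a, var) + gauss_pdf(y, -a, var))

    def f_y_given_one(y):
        joint = 0.5 * (gauss_pdf(y, a, var) * up_pos + gauss_pdf(y, -a, var) * up_neg)
        return joint / p_one

    h_y = diff_entropy_bits(f_y, lo, hi)
    # h(Y | Yhat1 = 1) equals h(Y | Yhat1 = -1) by symmetry; skip it if Pr(1) = 0.
    h_y_one = diff_entropy_bits(f_y_given_one, lo, hi) if p_one > 0.0 else 0.0
    h_y_given_yhat = 2.0 * p_one * h_y_one + p_erase * h_y
    entropy_yhat = -sum(p * math.log2(p) for p in (p_one, p_erase, p_one) if p > 0.0)
    return entropy_yhat + h_y_given_yhat - h_y
```

Evaluating `dhd_feasibility_term` on a grid of thresholds for a fixed small `g` gives a direct view of how large $T$ must be before the term drops below a given conference or relay link capacity.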



C. Evaluating the Information Rate with TS-DHD 
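1) Evaluating $I(X;Y,\hat{Y}_1)$: We first write 

\[ I(X;Y,\hat{Y}_1) = I(X;\hat{Y}_1) + I(X;Y|\hat{Y}_1). \]

Evaluating $I(X;\hat{Y}_1) = H(\hat{Y}_1) - H(\hat{Y}_1|X)$ requires the marginal of $\hat{Y}_1$. Using the mapping defined in (42) we find 
the marginal distribution of $\hat{Y}_1$: 

\[
\Pr(\hat{Y}_1) =
\begin{cases}
(1 - P_{\text{erase}})\Pr(Y_1 > T), & \hat{Y}_1 = 1 \\
\Pr(|Y_1| \le T) + P_{\text{erase}}\Pr(|Y_1| > T), & \hat{Y}_1 = E \\
(1 - P_{\text{erase}})\Pr(Y_1 < -T), & \hat{Y}_1 = -1
\end{cases}
\]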

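where 

\begin{align*}
\Pr(Y_1 > T) = \Pr(Y_1 < -T) &= \int_{y_1=T}^{\infty} \frac{1}{2}\left[G_{y_1}(g\sqrt{P},\sigma_1^2) + G_{y_1}(-g\sqrt{P},\sigma_1^2)\right] dy_1, \\
\Pr(|Y_1| \le T) &= \int_{y_1=-T}^{T} \frac{1}{2}\left[G_{y_1}(g\sqrt{P},\sigma_1^2) + G_{y_1}(-g\sqrt{P},\sigma_1^2)\right] dy_1.
\end{align*}

Also, due to symmetry we have that $H(\hat{Y}_1|X = \sqrt{P}) = H(\hat{Y}_1|X = -\sqrt{P})$, and therefore we need only to find 
the conditional $\Pr(\hat{Y}_1|X = \sqrt{P})$: 

\[
\Pr(\hat{Y}_1|X = \sqrt{P}) =
\begin{cases}
(1 - P_{\text{erase}})\Pr(Y_1 > T|X = \sqrt{P}), & \hat{Y}_1 = 1 \\
\Pr(|Y_1| \le T|X = \sqrt{P}) + P_{\text{erase}}\Pr(|Y_1| > T|X = \sqrt{P}), & \hat{Y}_1 = E \\
(1 - P_{\text{erase}})\Pr(Y_1 < -T|X = \sqrt{P}), & \hat{Y}_1 = -1
\end{cases}
\]

and we note that $f_{Y_1|X}(y_1|x = \sqrt{P}) = G_{y_1}(g\sqrt{P},\sigma_1^2)$. 

Next, we need to evaluate $I(X;Y|\hat{Y}_1) = h(Y|\hat{Y}_1) - h(Y|\hat{Y}_1,X)$. We first note that 

\[ h(Y|\hat{Y}_1,X) = h(X + N|X,\hat{Y}_1) = h(N|X,\hat{Y}_1) = h(N) = \frac{1}{2}\log_2(2\pi e \sigma^2). \]

Lastly, we have 

\[ h(Y|\hat{Y}_1) = \Pr(\hat{Y}_1 = 1)\,h(Y|\hat{Y}_1 = 1) + \Pr(\hat{Y}_1 = E)\,h(Y|\hat{Y}_1 = E) + \Pr(\hat{Y}_1 = -1)\,h(Y|\hat{Y}_1 = -1). \]

We note that $h(Y|\hat{Y}_1 = E) = h(Y)$ and that $h(Y|\hat{Y}_1 = 1)$ and $h(Y|\hat{Y}_1 = -1)$ are calculated exactly as in Appendix 
A-B for the DHD case. 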

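2) Evaluating $I(Y_1;\hat{Y}_1|Y)$: Begin by writing 

\begin{align*}
I(Y_1;\hat{Y}_1|Y) &= H(\hat{Y}_1|Y) - H(\hat{Y}_1|Y_1,Y) \\
&= h(Y|\hat{Y}_1) + H(\hat{Y}_1) - h(Y) - H(\hat{Y}_1|Y_1),
\end{align*}

where we used the fact that given $Y_1$, $\hat{Y}_1$ is independent of $Y$. All the terms in the above expressions have been 
calculated in the previous subsection, except $H(\hat{Y}_1|Y_1)$: 

\begin{align*}
H(\hat{Y}_1|Y_1) &= \Pr(Y_1 > T)\,H(\hat{Y}_1|Y_1 > T) + \Pr(|Y_1| \le T)\,H(\hat{Y}_1|\,|Y_1| \le T) + \Pr(Y_1 < -T)\,H(\hat{Y}_1|Y_1 < -T) \\
&= \Pr(Y_1 > T)\,H(P_{\text{erase}}, 1 - P_{\text{erase}}) + \Pr(Y_1 < -T)\,H(P_{\text{erase}}, 1 - P_{\text{erase}}) \\
&= \left(1 - \Pr(|Y_1| \le T)\right) H(P_{\text{erase}}, 1 - P_{\text{erase}}).
\end{align*}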
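The TS-DHD quantities differ from the DHD case only through the erasure probability $P_{\text{erase}}$ applied outside the threshold region. The Python sketch below computes the conditional pmf of $\hat{Y}_1$ given $X = \sqrt{P}$ and the term $H(\hat{Y}_1|Y_1)$ derived above. Function and parameter names are hypothetical and were chosen for this illustration only.

```python
import math

def q_func(x):
    """Gaussian tail probability Pr(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def binary_entropy_bits(p):
    """H(p, 1 - p) in bits, with the convention 0 log 0 = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def ts_dhd_pmf_given_plus(power, g, var1, threshold, p_erase):
    """(Pr(1), Pr(E), Pr(-1)) of the relay output given X = +sqrt(P)."""
    mean, std = g * math.sqrt(power), math.sqrt(var1)
    above = q_func((threshold - mean) / std)   # Pr(Y1 > T | X = +sqrt(P))
    below = q_func((threshold + mean) / std)   # Pr(Y1 < -T | X = +sqrt(P))
    inside = max(0.0, 1.0 - above - below)     # Pr(|Y1| <= T | X = +sqrt(P))
    return ((1.0 - p_erase) * above,
            inside + p_erase * (above + below),
            (1.0 - p_erase) * below)

def ts_dhd_entropy_yhat_given_y1(power, g, var1, threshold, p_erase):
    """H(Yhat1 | Y1) = (1 - Pr(|Y1| <= T)) * H(P_erase, 1 - P_erase)."""
    mean, std = g * math.sqrt(power), math.sqrt(var1)
    # By symmetry Pr(|Y1| > T) is the same for both inputs, hence also the marginal.
    outside = q_func((threshold - mean) / std) + q_func((threshold + mean) / std)
    return outside * binary_entropy_bits(p_erase)
```

Setting `p_erase = 0` recovers the deterministic hard decision, for which $H(\hat{Y}_1|Y_1) = 0$.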

D. Gaussian-Quantization Estimate-and-Forward 
Here the relay uses the assignment of equation (36): 

\[ \hat{Y}_1 = Y_1 + N_Q, \quad N_Q \sim \mathcal{N}(0,\sigma_Q^2). \]

We first evaluate 

\[ I(X;Y,\hat{Y}_1) = h(Y,\hat{Y}_1) - h(Y,\hat{Y}_1|X): \]

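1) We have 

\[ h(Y,\hat{Y}_1) = -\int_{y=-\infty}^{\infty}\int_{\hat{y}_1=-\infty}^{\infty} f_{Y,\hat{Y}_1}(y,\hat{y}_1)\log_2\left(f_{Y,\hat{Y}_1}(y,\hat{y}_1)\right) dy\,d\hat{y}_1, \]

\[
f_{Y,\hat{Y}_1}(y,\hat{y}_1) = \frac{1}{2}\left(G_y(\sqrt{P},\sigma^2)\,G_{\hat{y}_1}(g\sqrt{P},\sigma_1^2 + \sigma_Q^2) + G_y(-\sqrt{P},\sigma^2)\,G_{\hat{y}_1}(-g\sqrt{P},\sigma_1^2 + \sigma_Q^2)\right).
\tag{A.8}
\]

2) We also have 

\begin{align*}
h(Y,\hat{Y}_1|X) &= h(X + N, gX + N_1 + N_Q|X) \\
&= h(N, N_1 + N_Q|X) \\
&= h(N) + h(N_1 + N_Q) \\
&= \frac{1}{2}\log_2\left((2\pi e)^2 \sigma^2 (\sigma_1^2 + \sigma_Q^2)\right).
\end{align*}

Lastly we need to evaluate 

\[ I(Y_1;\hat{Y}_1|Y) = h(\hat{Y}_1|Y) - h(\hat{Y}_1|Y_1,Y) = h(\hat{Y}_1,Y) - h(Y) - h(\hat{Y}_1|Y_1,Y), \]

where 

\[ h(\hat{Y}_1|Y_1,Y) = h(Y_1 + N_Q|Y_1,Y) = h(N_Q|Y_1,Y) = h(N_Q) = \frac{1}{2}\log_2(2\pi e \sigma_Q^2). \]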
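For the Gaussian quantization, the two conditional entropies are available in closed form, while $h(Y,\hat{Y}_1)$ requires a two-dimensional integral of the mixture density (A.8). The following Python sketch combines them into $I(X;Y,\hat{Y}_1)$. It is a rough illustration under our own assumptions: the function names, the midpoint rule on a uniform grid and the truncation to eight standard deviations are not taken from the paper.

```python
import math

def gauss_pdf(x, mean, var):
    """Gaussian density with the given mean and variance, as in (A.4)."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gq_cond_entropy(var, var1, var_q):
    """h(Y, Yhat1 | X) = 0.5 log2((2 pi e)^2 var (var1 + var_q)) in bits."""
    return 0.5 * math.log2((2.0 * math.pi * math.e) ** 2 * var * (var1 + var_q))

def gq_quant_noise_entropy(var_q):
    """h(Yhat1 | Y1, Y) = h(N_Q) in bits."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * var_q)

def gq_joint_entropy(power, var, g, var1, var_q, steps=200):
    """Midpoint-rule approximation of h(Y, Yhat1) for the density (A.8)."""
    a, v1 = math.sqrt(power), var1 + var_q
    sy, s1 = math.sqrt(var), math.sqrt(v1)
    y_lo, y_hi = -a - 8.0 * sy, a + 8.0 * sy
    z_lo, z_hi = -abs(g) * a - 8.0 * s1, abs(g) * a + 8.0 * s1
    dy, dz = (y_hi - y_lo) / steps, (z_hi - z_lo) / steps
    total = 0.0
    for i in range(steps):
        y = y_lo + (i + 0.5) * dy
        gy_pos, gy_neg = gauss_pdf(y, a, var), gauss_pdf(y, -a, var)
        for j in range(steps):
            z = z_lo + (j + 0.5) * dz
            f = 0.5 * (gy_pos * gauss_pdf(z, g * a, v1) + gy_neg * gauss_pdf(z, -g * a, v1))
            if f > 0.0:
                total -= f * math.log2(f) * dy * dz
    return total

def gq_rate_term(power, var, g, var1, var_q):
    """I(X; Y, Yhat1) = h(Y, Yhat1) - h(Y, Yhat1 | X) in bits."""
    return gq_joint_entropy(power, var, g, var1, var_q) - gq_cond_entropy(var, var1, var_q)
```

Since the input is binary, the returned rate term cannot exceed one bit, which gives a quick consistency check on the numerical integration.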

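E. Approximation of HD-EAF for $\sigma^2 \to \infty$ 
Using (A.1) and (A.2) we can write 

\begin{align*}
R \le I(X;\hat{Y}_1) &= H(\hat{Y}_1) - H(\hat{Y}_1|X) \\
&= H\left(\tfrac{1}{2}P_{\text{no erase}},\, 1 - P_{\text{no erase}},\, \tfrac{1}{2}P_{\text{no erase}}\right) - H\left(P_1 P_{\text{no erase}},\, 1 - P_{\text{no erase}},\, (1 - P_1)P_{\text{no erase}}\right) \\
&= -P_{\text{no erase}}\log_2\left(\tfrac{1}{2}P_{\text{no erase}}\right) - (1 - P_{\text{no erase}})\log_2(1 - P_{\text{no erase}}) + P_1 P_{\text{no erase}}\log_2(P_1 P_{\text{no erase}}) \\
&\quad + (1 - P_{\text{no erase}})\log_2(1 - P_{\text{no erase}}) + (1 - P_1)P_{\text{no erase}}\log_2\left((1 - P_1)P_{\text{no erase}}\right) \\
&= -P_{\text{no erase}}\log_2(P_{\text{no erase}}) + P_{\text{no erase}} + P_1 P_{\text{no erase}}\log_2(P_1) + P_1 P_{\text{no erase}}\log_2(P_{\text{no erase}}) \\
&\quad + (1 - P_1)P_{\text{no erase}}\log_2(1 - P_1) + (1 - P_1)P_{\text{no erase}}\log_2(P_{\text{no erase}}) \\
&= P_{\text{no erase}}\left(1 + P_1\log_2(P_1) + (1 - P_1)\log_2(1 - P_1)\right) \\
&= P_{\text{no erase}}\left(1 - H(P_1, 1 - P_1)\right).
\end{align*}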

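\begin{align*}
I(Y_1;\hat{Y}_1|Y) &= H(\hat{Y}_1|Y) - H(\hat{Y}_1|Y_1,Y) \\
&\overset{(a)}{\approx} H(\hat{Y}_1) - H(\hat{Y}_1|Y_1) \\
&= H\left(\tfrac{1}{2}P_{\text{no erase}},\, 1 - P_{\text{no erase}},\, \tfrac{1}{2}P_{\text{no erase}}\right) - H(P_{\text{no erase}},\, 1 - P_{\text{no erase}}) \\
&= -P_{\text{no erase}}\log_2\left(\tfrac{1}{2}P_{\text{no erase}}\right) - (1 - P_{\text{no erase}})\log_2(1 - P_{\text{no erase}}) + P_{\text{no erase}}\log_2(P_{\text{no erase}}) \\
&\quad + (1 - P_{\text{no erase}})\log_2(1 - P_{\text{no erase}}) \\
&= P_{\text{no erase}},
\end{align*}

where in (a) we used the fact that $\hat{Y}_1$ and $Y$ are independent as $\sigma^2 \to \infty$, and that given $Y_1$, $\hat{Y}_1$ is independent of 
$Y$. 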
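Both closed-form expressions above follow from elementary manipulations of the pmfs (A.1) and (A.2), and can be verified numerically. The short Python sketch below computes each quantity twice, once directly from the entropies and once from the closed form. The function names are hypothetical.

```python
import math

def entropy_bits(pmf):
    """Discrete entropy in bits, with the convention 0 log 0 = 0."""
    return -sum(p * math.log2(p) for p in pmf if p > 0.0)

def hd_eaf_limits(p1, p_no_erase):
    """Return (rate_direct, rate_closed, feas_direct, feas_closed) in bits."""
    marginal = (0.5 * p_no_erase, 1.0 - p_no_erase, 0.5 * p_no_erase)      # (A.2)
    given_x = (p1 * p_no_erase, 1.0 - p_no_erase, (1.0 - p1) * p_no_erase)  # (A.1)
    # I(X; Yhat1) = H(Yhat1) - H(Yhat1 | X) and its closed form.
    rate_direct = entropy_bits(marginal) - entropy_bits(given_x)
    rate_closed = p_no_erase * (1.0 - entropy_bits((p1, 1.0 - p1)))
    # H(Yhat1) - H(Yhat1 | Y1) and its closed form P_no_erase.
    feas_direct = entropy_bits(marginal) - entropy_bits((p_no_erase, 1.0 - p_no_erase))
    feas_closed = p_no_erase
    return rate_direct, rate_closed, feas_direct, feas_closed
```

For any $P_1$ and $P_{\text{no erase}}$ in $(0,1)$ the two members of each pair agree up to floating-point rounding.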

Appendix B 
Proof of Corollary 3 

In the following we highlight only the modifications from the general broadcast result due to the application of 
DAF to the last conference step from $R_{x2}$ to $R_{x1}$, and the fact that we transmit a single message. 

1) Codebook Generation and Encoding at the Transmitter: The transmitter generates $2^{nR}$ codewords $\mathbf{x}$ in an 
i.i.d. manner according to $p(\mathbf{x}(w)) = \prod_{i=1}^{n} p(x_i(w))$, $w \in \mathcal{W} = \{1,2,\ldots,2^{nR}\}$. For transmission of the message 
$w_i$ at time $i$ the transmitter outputs $\mathbf{x}(w_i)$. 

2) Codebook Generation at the $R_{x1}$: The $K$ conference steps from $R_{x1}$ to $R_{x2}$ are carried out exactly as in 
Section V-B.4. The first $K - 1$ steps from $R_{x2}$ to $R_{x1}$ are carried out as in Section V-B.5. The $K$'th conference step 
from $R_{x2}$ to $R_{x1}$ is different from that of Theorem 4, as after the $K$'th step from $R_{x1}$ to $R_{x2}$, $R_{x2}$ may decode 
the message since $R_{x2}$ received all the $K$ conference messages from $R_{x1}$. Then, $R_{x2}$ uses decode-and-forward for 
its $K$'th conference transmission to $R_{x1}$. Therefore, $R_{x2}$ simply partitions $\mathcal{W}$ into $2^{n\alpha C_{21}}$ subsets in a uniform and 
independent manner. 

3) Encoding and Decoding at the $K$'th Conference Step from $R_{x2}$ to $R_{x1}$: 

• Before the $K$'th conference step, $R_{x2}$ decodes its message using its channel input and all the $K$ conference 
messages received from $R_{x1}$. This can be done with an arbitrarily small probability of error as long as (56b) 
is satisfied. 

• Having decoded its message, $R_{x2}$ uses the decode-and-forward strategy to select the $K$'th conference message 
to $R_{x1}$. The conference capacity allocated to this step is $R_{21}^{(K)} = \alpha C_{21}$. 

• Having received the $K$'th conference message from $R_{x2}$, $R_{x1}$ can now decode its message using the information 
received at the first $K - 1$ steps, and combining it with the information from the last step using the 
decode-and-forward decoding rule. This gives rise to (56a). 

4) Combining All the Conference Rate Bounds: The bounds on $R_{12}^{(k)}$, $k = 1,2,\ldots,K$ can be obtained as in 
Section V-B.6: 

\[ C_{12} = \sum_{k=1}^{K} R_{12}^{(k)}, \]

and similarly 

{l-a)C2i>l{YP,Y?\...,n K \Y?\ 

where $(1 - \alpha)C_{21}$ is the total capacity allocated to the first $K - 1$ conference steps from $R_{x2}$ to $R_{x1}$. This provides 
the rate constraints on the conference auxiliary variables. 
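The uniform and independent partition of the message set used for the last decode-and-forward conference step can be illustrated with a toy example. In the Python sketch below every message index is assigned to one of $2^{n\alpha C_{21}}$ bins independently and uniformly at random. The function names, the zero-based message indices, the seed, and the rounding of the set sizes to integers are assumptions made for this illustration and are not specified in the proof.

```python
import random

def conference_sizes(n, rate, alpha, c21):
    """Number of messages 2^{nR} and of bins 2^{n alpha C21} (floored to integers)."""
    num_messages = int(round(2 ** (n * rate)))
    num_bins = max(1, int(2 ** (n * alpha * c21)))
    return num_messages, num_bins

def random_partition(num_messages, num_bins, seed=0):
    """Assign each message index to a bin, independently and uniformly."""
    rng = random.Random(seed)
    return [rng.randrange(num_bins) for _ in range(num_messages)]

def bin_members(assignment, bin_index):
    """Messages (zero-based indices) that fall into the given bin."""
    return [w for w, b in enumerate(assignment) if b == bin_index]
```

In this picture the index of the bin containing the decoded message is the quantity sent over the conference link, and the receiving node intersects that bin with the candidates left after the earlier conference steps.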